Problems with estimating OR conditions, IS NULL on LEFT JOINs
Hi,
I ran into a pretty bad case of LEFT JOIN estimation, resulting in a
fairly arbitrary underestimate. The query is trivial, nothing overly
complex, and the more I think about it the more I think this is a fairly
fundamental flaw in how we estimate this type of join.
Imagine you have two trivial tables:
CREATE TABLE large (id INT, a INT);
INSERT INTO large SELECT i, i FROM generate_series(1,1000000) s(i);
CREATE TABLE small (id INT, b INT);
INSERT INTO small SELECT i, i FROM generate_series(1,100) s(i);
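(The estimates below assume both tables have up-to-date statistics - if
autoanalyze hasn't processed them yet, an explicit ANALYZE gets you to
the same starting point.)

ANALYZE large, small;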
The business meaning may be that "large" stores orders and "small"
stores events related to a tiny fraction of those orders (e.g. returns).
Now let's run a couple of simple LEFT JOIN queries, adding conditions to
them. Let's start with no condition at all:
EXPLAIN ANALYZE
SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
QUERY PLAN
----------------------------------------------------------------------
Hash Left Join (cost=3.25..18179.25 rows=1000000 width=16)
(actual time=0.069..550.290 rows=1000000 loops=1)
Hash Cond: (large.id = small.id)
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.010..174.056 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual time=0.052...
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8) ...
Planning Time: 0.291 ms
Execution Time: 663.551 ms
(8 rows)
So far so good - this estimate is perfect. Now, let's add an IS NULL
condition on the small table, to find rows without a match (e.g. orders
that were not returned):
EXPLAIN ANALYZE
SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
WHERE (small.id IS NULL);
QUERY PLAN
----------------------------------------------------------------------
Hash Anti Join (cost=3.25..27052.36 rows=999900 width=16)
(actual time=0.071..544.568 rows=999900 loops=1)
Hash Cond: (large.id = small.id)
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.015..166.019 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual time=0.051...
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8) ...
Planning Time: 0.260 ms
Execution Time: 658.379 ms
(8 rows)
Also very accurate, great! Now let's put a condition on the large table
instead, filtering out some of the rows:
EXPLAIN ANALYZE
SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
WHERE (large.a IN (1000, 2000, 3000, 4000, 5000));
QUERY PLAN
----------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..20684.75 rows=5 width=16)
(actual time=0.957..127.376 rows=5 loops=1)
Join Filter: (large.id = small.id)
Rows Removed by Join Filter: 500
-> Seq Scan on large (cost=0.00..20675.00 rows=5 width=8)
(actual time=0.878..127.171 rows=5 loops=1)
Filter: (a = ANY ('{1000,2000,3000,4000,5000}'::integer[]))
Rows Removed by Filter: 999995
-> Materialize (cost=0.00..2.50 rows=100 width=8) ...
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8) ...
Planning Time: 0.223 ms
Execution Time: 127.407 ms
(10 rows)
Another great estimate! Surely, if we combine both conditions with OR,
we'll get a decent estimate too?
EXPLAIN ANALYZE
SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
WHERE (small.id IS NULL)
OR (large.a IN (1000, 2000, 3000, 4000, 5000));
QUERY PLAN
----------------------------------------------------------------------
Hash Left Join (cost=3.25..18179.88 rows=5 width=16)
(actual time=0.073..580.827 rows=999900 loops=1)
Hash Cond: (large.id = small.id)
Filter: ((small.id IS NULL) OR
(large.a = ANY ('{1000,2000,3000,4000,5000}'::integer[])))
Rows Removed by Filter: 100
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.015..166.809 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual time=0.052...
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8) ...
Planning Time: 0.309 ms
Execution Time: 694.427 ms
(10 rows)
Well, bummer! This is pretty bad, because if we know that clause A alone
is estimated at 1M rows and clause B alone at 5 rows, then we'd expect
(A OR B) to be estimated as something >= max(1M, 5). For users running
into this, it has to be really surprising.
It's also quite serious, because with underestimates like this the
planner is likely to pick nestloops for additional joins, and we all
know how that performs for millions of rows ...
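For example, stack one more join on top of this (the "events" table and
its order_id column are purely hypothetical, just to illustrate the
point), and the rows=5 estimate is likely to make the planner drive it
with a nestloop:

SELECT *
  FROM large
  LEFT JOIN small ON (large.id = small.id)
  JOIN events e ON (e.order_id = large.id)   -- hypothetical third table
 WHERE (small.id IS NULL)
    OR (large.a IN (1000, 2000, 3000, 4000, 5000));

With the real ~1M rows coming out of the outer join, that inner side
would get executed a million times.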
So, how does this happen? The simple reason is that joins are estimated
by applying the selectivities of all the clauses to the cartesian
product. So we start from the product (100 * 1M), and then apply the
selectivity of the join condition and of the two single-table clauses.
The problem is that the selectivity for "IS NULL" is estimated using the
table-level statistics. But the LEFT JOIN entirely breaks the idea that
the null_frac has anything to do with NULLs in the join result. Because
the join result is not a subset of the cartesian product - it's a superset.
Even if the small table has no NULLs, the join can have them - that's
the whole point of outer joins.
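To make that concrete, here's roughly where the rows=5 estimate in the
OR query comes from (back-of-envelope, using the per-clause
selectivities the planner has to work with):

  sel(small.id IS NULL)  ~ 0.0            (null_frac of small.id is 0)
  sel(large.a IN (...))  ~ 5 / 1,000,000  (5 values, ~1M distinct)
  sel(A OR B)            ~ 0.0 + 0.000005 - 0.0 * 0.000005 = 0.000005

  join size before the WHERE  ~ 1,000,000 (as in the first plan)
  final estimate              ~ 1,000,000 * 0.000005 = 5 rows

The IS NULL clause contributes nothing, because according to the stats
the column has no NULLs - which is true for the table, but not for the
join result.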
When there's only the IS NULL condition, we actually recognize this as
a special case and treat the join as an antijoin (Hash Anti Join), which
also fixes the estimate - antijoins handle this fine. But as soon as you
change the condition a bit and do the IS NULL check on the other column
of the small table, it goes wrong too:
EXPLAIN ANALYZE
SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
WHERE (small.b IS NULL);
QUERY PLAN
----------------------------------------------------------------------
Hash Left Join (cost=3.25..18179.25 rows=1 width=16)
(actual time=0.311..3110.298 rows=999900 loops=1)
Hash Cond: (large.id = small.id)
Filter: (small.b IS NULL)
Rows Removed by Filter: 100
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.014..1032.497 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual time=0.287...
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8) ...
Planning Time: 0.105 ms
Execution Time: 4083.634 ms
(10 rows)
I'd bet most users would be rather surprised to learn this subtle change
makes a difference.
I wonder how to improve this, say by adjusting the IS NULL selectivity
when we know to operate on the outer side of the join. We're able to
do this for antijoins, so maybe we could do that here, somehow?
Unfortunately the important pieces (join type detection, IS NULL clause
estimation) happen pretty far apart, but maybe we could pass the info
about the fraction of NULLs introduced by the join down to nulltestsel,
and use it there (essentially by doing null_frac + join_null_frac). And
maybe we could do something like that for NULL tests on the other
columns from the nullable side ...
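In formula form, the adjustment could be something like (treating the
"original" NULLs and the NULLs injected by the join as independent):

  adjusted_null_frac = null_frac + join_null_frac
                     - null_frac * join_null_frac

where join_null_frac is the fraction of outer rows without a match -
roughly 0.9999 in the example above, which is where the bulk of the
correct estimate would have to come from.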
The other thing that might be beneficial is calculating bounds for the
estimate. In this case we're able to estimate the join size for the
individual conditions, and then we could ensure the final estimate for
the "A OR B" join is >= the max of those cardinalities.
Of course, that might be expensive if we have to redo some of the join
planning/estimates for each individual condition (otherwise we might not
notice the antijoin transformation and things like that).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
The problem is that the selectivity for "IS NULL" is estimated using the
table-level statistics. But the LEFT JOIN entirely breaks the idea that
the null_frac has anything to do with NULLs in the join result.
Right.
I wonder how to improve this, say by adjusting the IS NULL selectivity
when we know to operate on the outer side of the join. We're able to
do this for antijoins, so maybe we could do that here, somehow?
This mess is part of the long-term plan around the work I've been doing
on outer-join-aware Vars. We now have infrastructure that can let
the estimator routines see "oh, this Var isn't directly from a scan
of its table, it's been passed through a potentially-nulling outer
join --- and I can see which one". I don't have more than vague ideas
about what happens next, but that is clearly an essential step on the
road to doing better.
regards, tom lane
On 6/24/23 02:08, Tom Lane wrote:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
The problem is that the selectivity for "IS NULL" is estimated using the
table-level statistics. But the LEFT JOIN entirely breaks the idea that
the null_frac has anything to do with NULLs in the join result.
Right.
I wonder how to improve this, say by adjusting the IS NULL selectivity
when we know to operate on the outer side of the join. We're able to
do this for antijoins, so maybe we could do that here, somehow?
This mess is part of the long-term plan around the work I've been doing
on outer-join-aware Vars. We now have infrastructure that can let
the estimator routines see "oh, this Var isn't directly from a scan
of its table, it's been passed through a potentially-nulling outer
join --- and I can see which one". I don't have more than vague ideas
about what happens next, but that is clearly an essential step on the
road to doing better.
I was wondering if that work on outer-join-aware Vars could help with
this, but I wasn't following it very closely. I agree the ability to
check if the Var could be NULL due to an outer join seems useful, as it
tells us whether applying the raw attribute statistics makes sense or not.
I was thinking about what to do for the case when that's not possible,
i.e. when the Var refers to the nullable side of the join. Knowing that this
is happening is clearly not enough - we need to know how many new NULLs
are "injected" into the join result, and "communicate" that to the
estimation routines.
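For these particular tables (unique ids on both sides) the number we'd
need is easy enough to compute directly - just as a sanity check of what
the estimator should be aiming for:

SELECT 1.0 - (SELECT count(*) FROM small JOIN large USING (id))::numeric
           / (SELECT count(*) FROM large) AS join_null_frac;

which gives 0.9999 here, i.e. 99.99% of the left join result has NULLs
in all the "small" columns.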
Attached is a very ugly experimental patch doing that, and with it the
estimate changes to this:
QUERY PLAN
----------------------------------------------------------------------
Hash Left Join (cost=3.25..18179.88 rows=999900 width=16)
(actual time=0.528..596.151 rows=999900 loops=1)
Hash Cond: (large.id = small.id)
Filter: ((small.id IS NULL) OR
(large.a = ANY ('{1000,2000,3000,4000,5000}'::integer[])))
Rows Removed by Filter: 100
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.069..176.138 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8)
(actual time=0.371..0.373 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8)
(actual time=0.032..0.146 rows=100 loops=1)
Planning Time: 3.845 ms
Execution Time: 712.405 ms
(10 rows)
Seems nice, but. The patch is pretty ugly, and I don't claim it works
for other queries or that this is exactly what we should do. It
calculates the "unmatched frequency" next to eqjoinsel_inner, stashes
that info into sjinfo, and the estimator (nulltestsel) then uses it to
adjust the nullfrac it gets from the statistics.
The good thing is this helps even for IS NULL checks on non-join-key
columns (where we don't switch to an antijoin), but there are a couple
of things that I dislike ...
1) It's not restricted to outer joins or anything like that (this is
mostly just my laziness / interest in one particular query, but also
something the outer-join-aware patch might help with).
2) We probably don't want to pass this kind of information through
sjinfo. That was the simplest thing for an experimental patch, but I
suspect it's not the only piece of information we may need to pass to
the lower levels of estimation code.
3) I kinda doubt we actually want to move this responsibility (to
consider fraction of unmatched rows) to the low-level estimation
routines (e.g. nulltestsel and various others). AFAICS this just
"introduces NULLs" into the relation, so maybe we could "adjust" the
attribute statistics (in examine_variable?) by inflating null_frac and
modifying the other frequencies in the MCV/histogram (rough math for
this is sketched below, after point 4).
4) But I'm not sure we actually want to do that in these low-level
selectivity functions. The outer join essentially produces output with
two subsets - one with matches on the outer side, one without them. But
the side without matches has NULLs in all its columns. In a way, we know
exactly how these columns are correlated - if we do the usual estimation
(even with the null_frac adjusted), we just throw this information away.
And when there's a lot of rows without a match, that seems bad.
So maybe we should split the join estimate into two parts, one for each
subset of the join result. One for the rows with a match (and then we
can just do what we do now, with the attribute stats we already have).
And one for the "unmatched part" where we know the values from the
nullable side are NULL (and then we can easily "fake" stats with
null_frac=1.0).
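To make 3) and 4) a bit more concrete, here's the rough math I have in
mind (just a sketch, with u = fraction of outer rows without a match,
roughly 0.9999 in the example):

For 3), adjusting the nullable-side statistics:

  null_frac'   = null_frac + u - null_frac * u
  mcv_freq'[i] = mcv_freq[i] * (1 - u)

For 4), splitting the estimate into the two subsets, e.g. for the
(small.id IS NULL) OR (large.a IN (...)) query:

    matched part    ~     100 * sel(clause | regular small stats) ~ 0
  + unmatched part  ~ 999,900 * sel(clause | small.* all NULL)    ~ 999,900
  ------------------------------------------------------------------------
    total           ~ 999,900

which matches the actual row count.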
I really hope what I just wrote makes at least a little bit of sense.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
left-join-estimates.patch (text/x-patch)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c4fcd0076e..a035e213a0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -154,6 +154,13 @@ static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
bool have_mcvs1, bool have_mcvs2);
+static double eqjoinsel_unmatch_left(Oid opfuncoid, Oid collation,
+ VariableStatData *vardata1, VariableStatData *vardata2,
+ double nd1, double nd2,
+ bool isdefault1, bool isdefault2,
+ AttStatsSlot *sslot1, AttStatsSlot *sslot2,
+ Form_pg_statistic stats1, Form_pg_statistic stats2,
+ bool have_mcvs1, bool have_mcvs2);
static double eqjoinsel_semi(Oid opfuncoid, Oid collation,
VariableStatData *vardata1, VariableStatData *vardata2,
double nd1, double nd2,
@@ -1710,6 +1717,9 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
freq_null = stats->stanullfrac;
+ if (sjinfo)
+ freq_null = freq_null + sjinfo->unmatched_frac - freq_null * sjinfo->unmatched_frac;
+
switch (nulltesttype)
{
case IS_NULL:
@@ -2249,6 +2259,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
Oid collation = PG_GET_COLLATION();
double selec;
double selec_inner;
+ double unmatched_frac;
VariableStatData vardata1;
VariableStatData vardata2;
double nd1;
@@ -2321,6 +2332,26 @@ eqjoinsel(PG_FUNCTION_ARGS)
stats1, stats2,
have_mcvs1, have_mcvs2);
+ /*
+ * calculate fraction of right without of matching row on left
+ *
+ * FIXME Should be restricted to JOIN_LEFT, we should have similar logic
+ * for JOIN_FULL.
+ *
+ * XXX Probably should calculate unmatched as fraction of the join result,
+ * not of the relation on the right (because the matched part can have more
+ * matches per row and thus grow). Not sure. Depends on how it's used later.
+ */
+ unmatched_frac = eqjoinsel_unmatch_left(opfuncoid, collation,
+ &vardata1, &vardata2,
+ nd1, nd2,
+ isdefault1, isdefault2,
+ &sslot1, &sslot2,
+ stats1, stats2,
+ have_mcvs1, have_mcvs2);
+
+ sjinfo->unmatched_frac = unmatched_frac;
+
switch (sjinfo->jointype)
{
case JOIN_INNER:
@@ -2590,6 +2621,117 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
return selec;
}
+/*
+ * eqjoinsel_unmatch_left
+ *
+ * XXX Mostly copy paste of eqjoinsel_inner, but calculates unmatched fraction
+ * of the relation on the left. See eqjoinsel_inner for more comments.
+ *
+ * XXX Maybe we could/should integrate this into eqjoinsel_inner, so that we
+ * don't have to walk the MCV lists twice?
+ */
+static double
+eqjoinsel_unmatch_left(Oid opfuncoid, Oid collation,
+ VariableStatData *vardata1, VariableStatData *vardata2,
+ double nd1, double nd2,
+ bool isdefault1, bool isdefault2,
+ AttStatsSlot *sslot1, AttStatsSlot *sslot2,
+ Form_pg_statistic stats1, Form_pg_statistic stats2,
+ bool have_mcvs1, bool have_mcvs2)
+{
+ double unmatchfreq;
+
+ if (have_mcvs1 && have_mcvs2)
+ {
+ LOCAL_FCINFO(fcinfo, 2);
+ FmgrInfo eqproc;
+ bool *hasmatch1;
+ bool *hasmatch2;
+ double matchprodfreq,
+ matchfreq1,
+ unmatchfreq1;
+ int i,
+ nmatches;
+
+ fmgr_info(opfuncoid, &eqproc);
+
+ /*
+ * Save a few cycles by setting up the fcinfo struct just once. Using
+ * FunctionCallInvoke directly also avoids failure if the eqproc
+ * returns NULL, though really equality functions should never do
+ * that.
+ */
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ hasmatch1 = (bool *) palloc0(sslot1->nvalues * sizeof(bool));
+ hasmatch2 = (bool *) palloc0(sslot2->nvalues * sizeof(bool));
+
+ /*
+ * Note we assume that each MCV will match at most one member of the
+ * other MCV list. If the operator isn't really equality, there could
+ * be multiple matches --- but we don't look for them, both for speed
+ * and because the math wouldn't add up...
+ */
+ matchprodfreq = 0.0;
+ nmatches = 0;
+ for (i = 0; i < sslot1->nvalues; i++)
+ {
+ int j;
+
+ fcinfo->args[0].value = sslot1->values[i];
+
+ for (j = 0; j < sslot2->nvalues; j++)
+ {
+ Datum fresult;
+
+ if (hasmatch2[j])
+ continue;
+ fcinfo->args[1].value = sslot2->values[j];
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ if (!fcinfo->isnull && DatumGetBool(fresult))
+ {
+ hasmatch1[i] = hasmatch2[j] = true;
+ matchprodfreq += sslot1->numbers[i] * sslot2->numbers[j];
+ nmatches++;
+ break;
+ }
+ }
+ }
+ CLAMP_PROBABILITY(matchprodfreq);
+ /* Sum up frequencies of matched and unmatched MCVs */
+ matchfreq1 = unmatchfreq1 = 0.0;
+ for (i = 0; i < sslot1->nvalues; i++)
+ {
+ if (hasmatch1[i])
+ matchfreq1 += sslot1->numbers[i];
+ else
+ unmatchfreq1 += sslot1->numbers[i];
+ }
+ CLAMP_PROBABILITY(matchfreq1);
+ CLAMP_PROBABILITY(unmatchfreq1);
+
+ unmatchfreq = unmatchfreq1;
+ }
+ else
+ {
+ /*
+ * XXX Should this look at nullfrac on either side? Probably depends on
+ * if we're calculating fraction of NULLs or fraction of unmatched rows.
+ */
+ // unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
+ if (nd1 > nd2)
+ unmatchfreq = (nd1 - nd2) * 1.0 / nd1;
+ else
+ unmatchfreq = 0.0;
+ }
+
+ return unmatchfreq;
+}
+
/*
* eqjoinsel_semi --- eqjoinsel for semi join
*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c17b53f7ad..6bc63e648e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2853,6 +2853,9 @@ struct SpecialJoinInfo
bool semi_can_hash; /* true if semi_operators are all hash */
List *semi_operators; /* OIDs of equality join operators */
List *semi_rhs_exprs; /* righthand-side expressions of these ops */
+
+ /* For outer join, fraction of rows without a match. */
+ Selectivity unmatched_frac;
};
/*
On 24/6/2023 17:23, Tomas Vondra wrote:
I really hope what I just wrote makes at least a little bit of sense.
Throw in one more example:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
Here you can see the same kind of underestimation:
Hash Left Join (... rows=500 width=14) (... rows=99999 ...)
So the eqjoinsel_unmatch_left() function should be modified for the case
where nd1<nd2.
--
regards,
Andrey Lepikhov
Postgres Professional
Hi, all!
On 24.06.2023 14:23, Tomas Vondra wrote:
I really hope what I just wrote makes at least a little bit of sense.
regards
I am also interested in this problem.
I did some refactoring of the source code in the patch and moved the
calculation of the unmatched fraction into eqjoinsel_inner.
I added myself as a co-author of this commit, if you don't mind, and I'm
going to continue working on it.
On 26.06.2023 12:22, Andrey Lepikhov wrote:
On 24/6/2023 17:23, Tomas Vondra wrote:
I really hope what I just wrote makes at least a little bit of sense.
Throw in one more example:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
Here you can see the same kind of underestimation:
Hash Left Join (... rows=500 width=14) (... rows=99999 ...)
So the eqjoinsel_unmatch_left() function should be modified for the
case where nd1<nd2.
Unfortunately, this patch could not fix the cardinality calculation for
this query; I'll try to look and figure out what is missing here.
postgres=# SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
SELECT 100000
CREATE TABLE
INSERT 0 2
ANALYZE
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1.04..1819.07 rows=1 width=14) (actual
time=0.143..114.792 rows=99999 loops=1)
Hash Cond: (l.id = r.id)
Filter: (r.v IS NULL)
Rows Removed by Filter: 1
-> Seq Scan on l (cost=0.00..1443.00 rows=100000 width=4) (actual
time=0.027..35.278 rows=100000 loops=1)
-> Hash (cost=1.02..1.02 rows=2 width=10) (actual
time=0.014..0.017 rows=2 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on r (cost=0.00..1.02 rows=2 width=10) (actual
time=0.005..0.007 rows=2 loops=1)
Planning Time: 0.900 ms
Execution Time: 126.180 ms
(10 rows)
As in the previous query, even with the patch applied, the cardinality
is calculated poorly here; I would even say that it has become worse:
EXPLAIN ANALYZE
SELECT * FROM large FULL JOIN small ON (large.id = small.id)
WHERE (large.a IS NULL);
MASTER:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Merge Full Join (cost=127921.69..299941.59 rows=56503 width=16)
(actual time=795.092..795.094 rows=0 loops=1)
Merge Cond: (small.id = large.id)
Filter: (large.a IS NULL)
Rows Removed by Filter: 1000000
-> Sort (cost=158.51..164.16 rows=2260 width=8) (actual
time=0.038..0.046 rows=100 loops=1)
Sort Key: small.id
Sort Method: quicksort Memory: 29kB
-> Seq Scan on small (cost=0.00..32.60 rows=2260 width=8)
(actual time=0.013..0.022 rows=100 loops=1)
-> Materialize (cost=127763.19..132763.44 rows=1000050 width=8)
(actual time=363.016..649.103 rows=1000000 loops=1)
-> Sort (cost=127763.19..130263.31 rows=1000050 width=8)
(actual time=363.012..481.480 rows=1000000 loops=1)
Sort Key: large.id
Sort Method: external merge Disk: 17664kB
-> Seq Scan on large (cost=0.00..14425.50 rows=1000050
width=8) (actual time=0.009..111.166 rows=1000000 loops=1)
Planning Time: 0.124 ms
Execution Time: 797.139 ms
(15 rows)
With patch:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Hash Full Join (cost=3.25..18179.25 rows=999900 width=16) (actual
time=261.480..261.482 rows=0 loops=1)
Hash Cond: (large.id = small.id)
Filter: (large.a IS NULL)
Rows Removed by Filter: 1000000
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.006..92.827 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual
time=0.032..0.034 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8)
(actual time=0.008..0.015 rows=100 loops=1)
Planning Time: 0.151 ms
Execution Time: 261.529 ms
(10 rows)
In addition, I found a few more queries, where the estimation of
cardinality with the patch has become better:
EXPLAIN ANALYZE
SELECT * FROM small LEFT JOIN large ON (large.id = small.id)
WHERE (small.b IS NULL);
MASTER:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Hash Right Join (cost=32.74..18758.45 rows=55003 width=16) (actual
time=0.100..0.104 rows=0 loops=1)
Hash Cond: (large.id = small.id)
-> Seq Scan on large (cost=0.00..14425.50 rows=1000050 width=8)
(never executed)
-> Hash (cost=32.60..32.60 rows=11 width=8) (actual
time=0.089..0.091 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on small (cost=0.00..32.60 rows=11 width=8)
(actual time=0.088..0.088 rows=0 loops=1)
Filter: (b IS NULL)
Rows Removed by Filter: 100
Planning Time: 0.312 ms
Execution Time: 0.192 ms
(10 rows)
With patch:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Hash Right Join (cost=2.01..18177.02 rows=1 width=16) (actual
time=0.127..0.132 rows=0 loops=1)
Hash Cond: (large.id = small.id)
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(never executed)
-> Hash (cost=2.00..2.00 rows=1 width=8) (actual time=0.112..0.114
rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on small (cost=0.00..2.00 rows=1 width=8)
(actual time=0.111..0.111 rows=0 loops=1)
Filter: (b IS NULL)
Rows Removed by Filter: 100
Planning Time: 0.984 ms
Execution Time: 0.237 ms
(10 rows)
EXPLAIN ANALYZE
SELECT * FROM large FULL JOIN small ON (large.id = small.id)
WHERE (small.b IS NULL);
MASTER:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Merge Full Join (cost=127921.69..299941.59 rows=56503 width=16)
(actual time=339.478..819.232 rows=999900 loops=1)
Merge Cond: (small.id = large.id)
Filter: (small.b IS NULL)
Rows Removed by Filter: 100
-> Sort (cost=158.51..164.16 rows=2260 width=8) (actual
time=0.129..0.136 rows=100 loops=1)
Sort Key: small.id
Sort Method: quicksort Memory: 29kB
-> Seq Scan on small (cost=0.00..32.60 rows=2260 width=8)
(actual time=0.044..0.075 rows=100 loops=1)
-> Materialize (cost=127763.19..132763.44 rows=1000050 width=8)
(actual time=339.260..605.444 rows=1000000 loops=1)
-> Sort (cost=127763.19..130263.31 rows=1000050 width=8)
(actual time=339.254..449.930 rows=1000000 loops=1)
Sort Key: large.id
Sort Method: external merge Disk: 17664kB
-> Seq Scan on large (cost=0.00..14425.50 rows=1000050
width=8) (actual time=0.032..104.484 rows=1000000 loops=1)
Planning Time: 0.324 ms
Execution Time: 859.705 ms
(15 rows)
With patch:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Hash Full Join (cost=3.25..18179.25 rows=999900 width=16) (actual
time=0.162..349.683 rows=999900 loops=1)
Hash Cond: (large.id = small.id)
Filter: (small.b IS NULL)
Rows Removed by Filter: 100
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.021..95.972 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual
time=0.125..0.127 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8)
(actual time=0.030..0.059 rows=100 loops=1)
Planning Time: 0.218 ms
Execution Time: 385.819 ms
(10 rows)
EXPLAIN ANALYZE
SELECT * FROM large RIGHT JOIN small ON (large.id = small.id)
WHERE (large.a IS NULL);
MASTER:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Merge Left Join (cost=127921.69..299941.59 rows=56503 width=16)
(actual time=345.403..345.404 rows=0 loops=1)
Merge Cond: (small.id = large.id)
Filter: (large.a IS NULL)
Rows Removed by Filter: 100
-> Sort (cost=158.51..164.16 rows=2260 width=8) (actual
time=0.033..0.039 rows=100 loops=1)
Sort Key: small.id
Sort Method: quicksort Memory: 29kB
-> Seq Scan on small (cost=0.00..32.60 rows=2260 width=8)
(actual time=0.012..0.020 rows=100 loops=1)
-> Materialize (cost=127763.19..132763.44 rows=1000050 width=8)
(actual time=345.287..345.315 rows=101 loops=1)
-> Sort (cost=127763.19..130263.31 rows=1000050 width=8)
(actual time=345.283..345.295 rows=101 loops=1)
Sort Key: large.id
Sort Method: external merge Disk: 17664kB
-> Seq Scan on large (cost=0.00..14425.50 rows=1000050
width=8) (actual time=0.009..104.648 rows=1000000 loops=1)
Planning Time: 0.098 ms
Execution Time: 347.807 ms
(15 rows)
With patch:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Hash Right Join (cost=3.25..18179.25 rows=100 width=16) (actual
time=209.838..209.842 rows=0 loops=1)
Hash Cond: (large.id = small.id)
Filter: (large.a IS NULL)
Rows Removed by Filter: 100
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.006..91.571 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8) (actual
time=0.034..0.036 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8)
(actual time=0.008..0.016 rows=100 loops=1)
Planning Time: 0.168 ms
Execution Time: 209.883 ms
(10 rows)
Attachments:
0001-Fixed-the-case-of-calculating-underestimated-cardina.patch (text/x-patch)
From e8653477fbe7321325122a2d5032798cd6a95a8b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Mon, 26 Jun 2023 15:39:11 +0300
Subject: [PATCH] Fixed the case of calculating underestimated cardinality for
an LEFT JOIN with the restriction "IS NULL" in the clause. This error is
caused by an incorrect calculation of selectivity in the "IS NULL" clause,
since it took into account only table-level statistics without zero values in
the results of the join operation. This patch fixes this by calculating the
fraction of zero values on the right side of the join without of matching row
on left.
Co-authored-by: Alena Rybakina <a.rybakina@postgrespro.ru>
---
src/backend/utils/adt/selfuncs.c | 35 +++++++++++++++++++++++++++++---
src/include/nodes/pathnodes.h | 3 +++
2 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c4fcd0076ea..8e18aa1dd2b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -153,7 +153,7 @@ static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
bool isdefault1, bool isdefault2,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
- bool have_mcvs1, bool have_mcvs2);
+ bool have_mcvs1, bool have_mcvs2, double *unmatched_frac);
static double eqjoinsel_semi(Oid opfuncoid, Oid collation,
VariableStatData *vardata1, VariableStatData *vardata2,
double nd1, double nd2,
@@ -1710,6 +1710,9 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
freq_null = stats->stanullfrac;
+ if (sjinfo)
+ freq_null = freq_null + sjinfo->unmatched_frac - freq_null * sjinfo->unmatched_frac;
+
switch (nulltesttype)
{
case IS_NULL:
@@ -2313,13 +2316,24 @@ eqjoinsel(PG_FUNCTION_ARGS)
}
/* We need to compute the inner-join selectivity in all cases */
+ /*
+ * calculate fraction of right without of matching row on left
+ *
+ * FIXME Should be restricted to JOIN_LEFT, we should have similar logic
+ * for JOIN_FULL.
+ *
+ * XXX Probably should calculate unmatched as fraction of the join result,
+ * not of the relation on the right (because the matched part can have more
+ * matches per row and thus grow). Not sure. Depends on how it's used later.
+ */
selec_inner = eqjoinsel_inner(opfuncoid, collation,
&vardata1, &vardata2,
nd1, nd2,
isdefault1, isdefault2,
&sslot1, &sslot2,
stats1, stats2,
- have_mcvs1, have_mcvs2);
+ have_mcvs1, have_mcvs2,
+ &sjinfo->unmatched_frac);
switch (sjinfo->jointype)
{
@@ -2407,7 +2421,7 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
bool isdefault1, bool isdefault2,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
- bool have_mcvs1, bool have_mcvs2)
+ bool have_mcvs1, bool have_mcvs2, double *unmatched_frac)
{
double selec;
@@ -2503,7 +2517,10 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
}
CLAMP_PROBABILITY(matchfreq1);
CLAMP_PROBABILITY(unmatchfreq1);
+
+ *unmatched_frac = unmatchfreq1;
matchfreq2 = unmatchfreq2 = 0.0;
+
for (i = 0; i < sslot2->nvalues; i++)
{
if (hasmatch2[i])
@@ -2581,10 +2598,22 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
double nullfrac2 = stats2 ? stats2->stanullfrac : 0.0;
selec = (1.0 - nullfrac1) * (1.0 - nullfrac2);
+
+ /*
+ * XXX Should this look at nullfrac on either side? Probably depends on
+ * if we're calculating fraction of NULLs or fraction of unmatched rows.
+ */
+ // unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
if (nd1 > nd2)
+ {
selec /= nd1;
+ *unmatched_frac = (nd1 - nd2) * 1.0 / nd1;
+ }
else
+ {
selec /= nd2;
+ *unmatched_frac = 0.0;
+ }
}
return selec;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c17b53f7adb..6bc63e648e6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2853,6 +2853,9 @@ struct SpecialJoinInfo
bool semi_can_hash; /* true if semi_operators are all hash */
List *semi_operators; /* OIDs of equality join operators */
List *semi_rhs_exprs; /* righthand-side expressions of these ops */
+
+ /* For outer join, fraction of rows without a match. */
+ Selectivity unmatched_frac;
};
/*
--
2.34.1
On 6/26/23 20:15, Alena Rybakina wrote:
Hi, all!
I am also interested in this problem.
I did some refactoring of the source code in the patch and moved the
calculation of the unmatched fraction into eqjoinsel_inner.
I added myself as a co-author of this commit, if you don't mind, and I'm
going to continue working on it.
Sure, if you want to take over the patch and continue working on it,
that's perfectly fine with me. I'm happy to cooperate on it, doing
reviews etc.
On 26.06.2023 12:22, Andrey Lepikhov wrote:
So the eqjoinsel_unmatch_left() function should be modified for the
case where nd1<nd2.

Unfortunately, this patch could not fix the cardinality calculation for
this query; I'll try to look and figure out what is missing here.

As in the previous query, even with the patch applied, the cardinality
is calculated poorly here; I would even say that it has become worse.

In addition, I found a few more queries where the estimation of
cardinality with the patch has become better.
Yes, this does not surprise me at all - the patch was very early /
experimental code, and I only really aimed it at that single example
query. So don't hesitate to rethink what/how the patch works.
I do think collecting / constructing a wider set of queries with joins
of different types is going to be an important step in working on this
patch and making sure it doesn't break anything.
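As a starting point, the queries already posted in this thread would
make a reasonable matrix (using the large/small tables from my first
message), something like:

SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
  WHERE (small.b IS NULL);
SELECT * FROM small LEFT JOIN large ON (large.id = small.id)
  WHERE (small.b IS NULL);
SELECT * FROM large RIGHT JOIN small ON (large.id = small.id)
  WHERE (large.a IS NULL);
SELECT * FROM large FULL JOIN small ON (large.id = small.id)
  WHERE (large.a IS NULL);
SELECT * FROM large FULL JOIN small ON (large.id = small.id)
  WHERE (small.b IS NULL);

plus IS NOT NULL variants, conditions on the join keys vs. other
columns, and the original OR query.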
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi, all!
On 26.06.2023 12:22, Andrey Lepikhov wrote:
On 24/6/2023 17:23, Tomas Vondra wrote:
I really hope what I just wrote makes at least a little bit of sense.
Throw in one more example:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
Here you can see the same kind of underestimation:
Hash Left Join (... rows=500 width=14) (... rows=99999 ...)
So the eqjoinsel_unmatch_left() function should be modified for the
case where nd1<nd2.
Unfortunately, this patch could not fix the cardinality calculation for
this query; I'll try to look and figure out what is missing here.
I tried to fix the cardinality score in the query above by changing:
diff --git a/src/backend/utils/adt/selfuncs.c
b/src/backend/utils/adt/selfuncs.c
index 8e18aa1dd2b..40901836146 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2604,11 +2604,16 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
* if we're calculating fraction of NULLs or fraction of unmatched rows.
*/
// unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
- if (nd1 > nd2)
+ if (nd1 != nd2)
{
- selec /= nd1;
- *unmatched_frac = (nd1 - nd2) * 1.0 / nd1;
+ selec /= Max(nd1, nd2);
+ *unmatched_frac = abs(nd1 - nd2) * 1.0 / Max(nd1, nd2);
}
+ /*if (nd1 > nd2)
+ {
+ selec /= nd1;
+ *unmatched_frac = nd1 - nd2 * 1.0 / nd1;
+ }*/
else
{
selec /= nd2;
and it worked:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
ERROR: relation "l" already exists
ERROR: relation "r" already exists
INSERT 0 2
ANALYZE
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1.09..1944.13 rows=99998 width=14) (actual
time=0.152..84.184 rows=99999 loops=1)
Hash Cond: (l.id = r.id)
Filter: (r.v IS NULL)
Rows Removed by Filter: 2
-> Seq Scan on l (cost=0.00..1443.00 rows=100000 width=4) (actual
time=0.040..27.635 rows=100000 loops=1)
-> Hash (cost=1.04..1.04 rows=4 width=10) (actual
time=0.020..0.022 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on r (cost=0.00..1.04 rows=4 width=10) (actual
time=0.009..0.011 rows=4 loops=1)
Planning Time: 0.954 ms
Execution Time: 92.309 ms
(10 rows)
It looks too simple and I suspect that I might have missed something
somewhere, but so far I haven't found any examples of queries where it
doesn't work.
I didn't see it breaking anything in the examples from my previous
letter [1].
1.
/messages/by-id/7af1464e-2e24-cfb1-b6d4-1544757f8cfa@yandex.ru
Unfortunately, I can't understand your idea from point 4, please explain it?
The good thing is this helps even for IS NULL checks on non-join-key
columns (where we don't switch to an antijoin), but there are a couple
of things that I dislike ...
1) It's not restricted to outer joins or anything like that (this is
mostly just my laziness / interest in one particular query, but also
something the outer-join-aware patch might help with).
2) We probably don't want to pass this kind of information through
sjinfo. That was the simplest thing for an experimental patch, but I
suspect it's not the only piece of information we may need to pass to
the lower levels of estimation code.
3) I kinda doubt we actually want to move this responsibility (to
consider fraction of unmatched rows) to the low-level estimation
routines (e.g. nulltestsel and various others). AFAICS this just
"introduces NULLs" into the relation, so maybe we could "adjust" the
attribute statistics (in examine_variable?) by inflating null_frac and
modifying the other frequencies in MCV/histogram.
4) But I'm not sure we actually want to do that in these low-level
selectivity functions. The outer join essentially produces output with
two subsets - one with matches on the outer side, one without them. But
the side without matches has NULLs in all columns. In a way, we know
exactly how are these columns correlated - if we do the usual estimation
(even with the null_frac adjusted), we just throw this information away.
And when there's a lot of rows without a match, that seems bad.
--
Regards,
Alena Rybakina
Postgres Professional
Attachments:
0001-Fixed-the-case-of-calculating-underestimated-cardina.patch (text/x-patch)
From 54a24390f1137a77c9755a875774a75ae8cf2424 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Mon, 26 Jun 2023 15:39:11 +0300
Subject: [PATCH] Fixed the case of calculating underestimated cardinality for
an LEFT JOIN with the restriction "IS NULL" in the clause. This error is
caused by an incorrect calculation of selectivity in the "IS NULL" clause,
since it took into account only table-level statistics without zero values in
the results of the join operation. This patch fixes this by calculating the
fraction of zero values on the right side of the join without of matching row
on left.
Co-authored-by: Alena Rybakina <a.rybakina@postgrespro.ru>
---
src/backend/utils/adt/selfuncs.c | 39 ++++++++++++++++++++++++++++----
src/include/nodes/pathnodes.h | 3 +++
2 files changed, 37 insertions(+), 5 deletions(-)
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c4fcd0076ea..a0e3834453b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -153,7 +153,7 @@ static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
bool isdefault1, bool isdefault2,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
- bool have_mcvs1, bool have_mcvs2);
+ bool have_mcvs1, bool have_mcvs2, double *unmatched_frac);
static double eqjoinsel_semi(Oid opfuncoid, Oid collation,
VariableStatData *vardata1, VariableStatData *vardata2,
double nd1, double nd2,
@@ -1710,6 +1710,9 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
freq_null = stats->stanullfrac;
+ if (sjinfo)
+ freq_null = freq_null + sjinfo->unmatched_frac - freq_null * sjinfo->unmatched_frac;
+
switch (nulltesttype)
{
case IS_NULL:
@@ -2313,13 +2316,24 @@ eqjoinsel(PG_FUNCTION_ARGS)
}
/* We need to compute the inner-join selectivity in all cases */
+ /*
+ * calculate fraction of right without of matching row on left
+ *
+ * FIXME Should be restricted to JOIN_LEFT, we should have similar logic
+ * for JOIN_FULL.
+ *
+ * XXX Probably should calculate unmatched as fraction of the join result,
+ * not of the relation on the right (because the matched part can have more
+ * matches per row and thus grow). Not sure. Depends on how it's used later.
+ */
selec_inner = eqjoinsel_inner(opfuncoid, collation,
&vardata1, &vardata2,
nd1, nd2,
isdefault1, isdefault2,
&sslot1, &sslot2,
stats1, stats2,
- have_mcvs1, have_mcvs2);
+ have_mcvs1, have_mcvs2,
+ &sjinfo->unmatched_frac);
switch (sjinfo->jointype)
{
@@ -2407,7 +2421,7 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
bool isdefault1, bool isdefault2,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
- bool have_mcvs1, bool have_mcvs2)
+ bool have_mcvs1, bool have_mcvs2, double *unmatched_frac)
{
double selec;
@@ -2503,7 +2517,10 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
}
CLAMP_PROBABILITY(matchfreq1);
CLAMP_PROBABILITY(unmatchfreq1);
+
+ *unmatched_frac = unmatchfreq1;
matchfreq2 = unmatchfreq2 = 0.0;
+
for (i = 0; i < sslot2->nvalues; i++)
{
if (hasmatch2[i])
@@ -2581,10 +2598,22 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
double nullfrac2 = stats2 ? stats2->stanullfrac : 0.0;
selec = (1.0 - nullfrac1) * (1.0 - nullfrac2);
- if (nd1 > nd2)
- selec /= nd1;
+
+ /*
+ * XXX Should this look at nullfrac on either side? Probably depends on
+ * if we're calculating fraction of NULLs or fraction of unmatched rows.
+ */
+ // unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
+ if (nd1 != nd2)
+ {
+ selec /= Max(nd1, nd2);
+ *unmatched_frac = abs(nd1 - nd2) * 1.0 / Max(nd1, nd2);
+ }
else
+ {
selec /= nd2;
+ *unmatched_frac = 0.0;
+ }
}
return selec;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c17b53f7adb..6bc63e648e6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2853,6 +2853,9 @@ struct SpecialJoinInfo
bool semi_can_hash; /* true if semi_operators are all hash */
List *semi_operators; /* OIDs of equality join operators */
List *semi_rhs_exprs; /* righthand-side expressions of these ops */
+
+ /* For outer join, fraction of rows without a match. */
+ Selectivity unmatched_frac;
};
/*
--
2.34.1
On 7/6/23 15:51, Alena Rybakina wrote:
Hi, all!
On 26.06.2023 12:22, Andrey Lepikhov wrote:
On 24/6/2023 17:23, Tomas Vondra wrote:
I really hope what I just wrote makes at least a little bit of sense.
Throw in one more example:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;

Here you can see the same kind of underestimation:

Hash Left Join (... rows=500 width=14) (... rows=99999 ...)

So the eqjoinsel_unmatch_left() function should be modified for the
case where nd1<nd2.

Unfortunately, this patch could not fix the cardinality calculation for
this query; I'll try to look into it and figure out what is missing here.

I tried to fix the cardinality estimate in the query above by changing:

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 8e18aa1dd2b..40901836146 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2604,11 +2604,16 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
 * if we're calculating fraction of NULLs or fraction of unmatched rows.
 */
 // unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
- if (nd1 > nd2)
+ if (nd1 != nd2)
 {
- selec /= nd1;
- *unmatched_frac = (nd1 - nd2) * 1.0 / nd1;
+ selec /= Max(nd1, nd2);
+ *unmatched_frac = abs(nd1 - nd2) * 1.0 / Max(nd1, nd2);
 }
+ /*if (nd1 > nd2)
+ {
+ selec /= nd1;
+ *unmatched_frac = nd1 - nd2 * 1.0 / nd1;
+ }*/
 else
 {
 selec /= nd2;

and it worked:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
ERROR: relation "l" already exists
ERROR: relation "r" already exists
INSERT 0 2
ANALYZE
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1.09..1944.13 rows=99998 width=14) (actual
time=0.152..84.184 rows=99999 loops=1)
Hash Cond: (l.id = r.id)
Filter: (r.v IS NULL)
Rows Removed by Filter: 2
-> Seq Scan on l (cost=0.00..1443.00 rows=100000 width=4) (actual
time=0.040..27.635 rows=100000 loops=1)
-> Hash (cost=1.04..1.04 rows=4 width=10) (actual time=0.020..0.022
rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on r (cost=0.00..1.04 rows=4 width=10) (actual
time=0.009..0.011 rows=4 loops=1)
Planning Time: 0.954 ms
Execution Time: 92.309 ms
(10 rows)

It looks too simple and I suspect that I might have missed something
somewhere, but so far I haven't found any examples of queries where it
doesn't work. I didn't see it breaking anything in the examples from my
previous letter [1].
I think it's correct. Or at least it doesn't break anything my patch
didn't already break. My patch was simply written for one specific
query, so it didn't consider the option that the nd1 and nd2 values
might be in the opposite direction ...
1. /messages/by-id/7af1464e-2e24-cfb1-b6d4-1544757f8cfa@yandex.ru

Unfortunately, I can't understand your idea from point 4 - could you
explain it, please?
The good thing is this helps even for IS NULL checks on non-join-key
columns (where we don't switch to an antijoin), but there's a couple
things that I dislike ...

1) It's not restricted to outer joins or anything like that (this is
mostly just my laziness / interest in one particular query, but also
something the outer-join-aware patch might help with).

2) We probably don't want to pass this kind of information through
sjinfo. That was the simplest thing for an experimental patch, but I
suspect it's not the only piece of information we may need to pass to
the lower levels of estimation code.

3) I kinda doubt we actually want to move this responsibility (to
consider fraction of unmatched rows) to the low-level estimation
routines (e.g. nulltestsel and various others). AFAICS this just
"introduces NULLs" into the relation, so maybe we could "adjust" the
attribute statistics (in examine_variable?) by inflating null_frac and
modifying the other frequencies in MCV/histogram.

4) But I'm not sure we actually want to do that in these low-level
selectivity functions. The outer join essentially produces output with
two subsets - one with matches on the outer side, one without them. But
the side without matches has NULLs in all columns. In a way, we know
exactly how are these columns correlated - if we do the usual estimation
(even with the null_frac adjusted), we just throw this information away.
And when there's a lot of rows without a match, that seems bad.
Well, one option would be to modify all selectivity functions to do
something like the patch does for nulltestsel(). That seems a bit
cumbersome because why should those places care about maybe running on
the outer side of a join, or what? For code in extensions this would be
particularly problematic, I think.
So what I was thinking about doing this in a way that'd make this
automatic, without having to modify the selectivity functions.
Option (3) is very simple - examine_variable would simply adjust the
statistics by tweaking the null_frac field, when looking at variables on
the outer side of the join. But it has issues when estimating multiple
conditions.
Imagine t1 has 1M rows, and we want to estimate
SELECT * FROM t1 LEFT JOIN t2 ON (t1.id = t2.id)
WHERE ((t2.a=1) AND (t2.b=1))
but only 50% of the t1 rows has a match in t2. Assume each of the t2
conditions matches 100% rows in the table. With the correction, this
means 50% selectivity for each condition. And if we combine them the
usual way, it's 0.5 * 0.5 = 0.25.
But we know all the rows in the "matching" part match the condition, so
the correct selectivity should be 0.5.
In a way, this is just another case of estimation issues due to the
assumption of independence.
But (4) was suggesting we could improve this essentially by treating the
join as two distinct sets of rows
- the inner join result
- rows without match on the outer side
For the inner part, we would do estimates as now (using the regular
per-column statistics). If we knew the conditions match 100% rows, we'd
still get 100% when the conditions are combined.
For the second part of the join we know the outer side is just NULLs in
all columns, and that'd make the estimation much simpler for most
clauses. We'd just need to have "fake" statistics with null_frac=1.0 and
that's it.
And then we'd just combine these two selectivities. If we know the inner
side is 50% and all rows match the conditions, and no rows in the other
50% match, the selectivity is 50%.
inner_part * inner_sel + outer_part * outer_sel = 0.5 * 1.0 + 0.0 = 0.5
Now, we still have issues with independence assumption in each of these
parts separately. But that's OK, I think.
I think (4) could be implemented by doing the current estimation for the
inner part, and by tweaking examine_variable in the "outer" part in a
way similar to (3). Except that it just sets null_frac=1.0 everywhere.
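To make the arithmetic concrete, here is a tiny standalone sketch
(illustrative numbers only, nothing taken from the actual planner code)
comparing how option (3) and option (4) combine the two t2 conditions
from the example above:

#include <stdio.h>

int
main(void)
{
    double match_frac = 0.5;  /* fraction of t1 rows with a match in t2 */
    double cond_sel = 1.0;    /* each t2 condition matches all matched rows */

    /* per-condition selectivity once the join's NULLs inflate null_frac */
    double corrected_sel = match_frac * cond_sel;

    /* option (3): combine the corrected conditions assuming independence */
    double naive = corrected_sel * corrected_sel;

    /* option (4): estimate the matched and NULL-extended subsets separately */
    double split = match_frac * (cond_sel * cond_sel)   /* inner part */
        + (1.0 - match_frac) * 0.0;                     /* all-NULL part */

    printf("naive = %.2f, split = %.2f\n", naive, split);
    return 0;
}

Running it prints naive = 0.25, split = 0.50, i.e. the 0.25 vs 0.5
numbers discussed above.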
FWIW, I used "AND" in the example for simplicity, but that'd probably be
pushed to the baserel level. There'd need to be OR to keep it at the
join level, but the overall issue is the same, I think.
Also, this entirely ignores extended statistics - I have no idea how we
might tweak those in (3). For (4) we don't need to tweak those at all,
because for inner part we can just apply them as is, and for outer part
it's irrelevant because everything is NULL.
I hope this makes more sense. If not, let me know and I'll try to
explain it better.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Well, one option would be to modify all selectivity functions to do
something like the patch does for nulltestsel(). That seems a bit
cumbersome because why should those places care about maybe running on
the outer side of a join, or what? For code in extensions this would be
particularly problematic, I think.
Agree. I would say that we can try it if nothing else works out.
So what I was thinking about doing this in a way that'd make this
automatic, without having to modify the selectivity functions.

Option (3) is very simple - examine_variable would simply adjust the
statistics by tweaking the null_frac field, when looking at variables on
the outer side of the join. But it has issues when estimating multiple
conditions.

Imagine t1 has 1M rows, and we want to estimate

SELECT * FROM t1 LEFT JOIN t2 ON (t1.id = t2.id)
WHERE ((t2.a=1) AND (t2.b=1))

but only 50% of the t1 rows has a match in t2. Assume each of the t2
conditions matches 100% rows in the table. With the correction, this
means 50% selectivity for each condition. And if we combine them the
usual way, it's 0.5 * 0.5 = 0.25.

But we know all the rows in the "matching" part match the condition, so
the correct selectivity should be 0.5.

In a way, this is just another case of estimation issues due to the
assumption of independence.

FWIW, I used "AND" in the example for simplicity, but that'd probably be
pushed to the baserel level. There'd need to be OR to keep it at the
join level, but the overall issue is the same, I think.

Also, this entirely ignores extended statistics - I have no idea how we
might tweak those in (3).
I understood the idea - it is very similar to what is implemented in the
current patch.
But I don't understand how to do it in the examine_variable function, to
be honest.
But (4) was suggesting we could improve this essentially by treating the
join as two distinct sets of rows

- the inner join result
- rows without match on the outer side

For the inner part, we would do estimates as now (using the regular
per-column statistics). If we knew the conditions match 100% rows, we'd
still get 100% when the conditions are combined.

For the second part of the join we know the outer side is just NULLs in
all columns, and that'd make the estimation much simpler for most
clauses. We'd just need to have "fake" statistics with null_frac=1.0 and
that's it.

And then we'd just combine these two selectivities. If we know the inner
side is 50% and all rows match the conditions, and no rows in the other
50% match, the selectivity is 50%.

inner_part * inner_sel + outer_part * outer_sel = 0.5 * 1.0 + 0.0 = 0.5

Now, we still have issues with independence assumption in each of these
parts separately. But that's OK, I think.

I think (4) could be implemented by doing the current estimation for the
inner part, and by tweaking examine_variable in the "outer" part in a
way similar to (3). Except that it just sets null_frac=1.0 everywhere.

For (4) we don't need to tweak those at all,
because for inner part we can just apply them as is, and for outer part
it's irrelevant because everything is NULL.
I like this idea the most) I'll try to start with this and implement the
patch.
I hope this makes more sense. If not, let me know and I'll try to
explain it better.
Thank you for your explanation)
I will write back soon with the results or if I have any questions.
--
Regards,
Alena Rybakina
Postgres Professional
On 7/8/23 10:29, Alena Rybakina wrote:
Well, one option would be to modify all selectivity functions to do
something like the patch does for nulltestsel(). That seems a bit
cumbersome because why should those places care about maybe running on
the outer side of a join, or what? For code in extensions this would be
particularly problematic, I think.

Agree. I would say that we can try it if nothing else works out.
So what I was thinking about doing this in a way that'd make this
automatic, without having to modify the selectivity functions.

Option (3) is very simple - examine_variable would simply adjust the
statistics by tweaking the null_frac field, when looking at variables on
the outer side of the join. But it has issues when estimating multiple
conditions.

Imagine t1 has 1M rows, and we want to estimate

SELECT * FROM t1 LEFT JOIN t2 ON (t1.id = t2.id)
WHERE ((t2.a=1) AND (t2.b=1))

but only 50% of the t1 rows has a match in t2. Assume each of the t2
conditions matches 100% rows in the table. With the correction, this
means 50% selectivity for each condition. And if we combine them the
usual way, it's 0.5 * 0.5 = 0.25.

But we know all the rows in the "matching" part match the condition, so
the correct selectivity should be 0.5.

In a way, this is just another case of estimation issues due to the
assumption of independence.

FWIW, I used "AND" in the example for simplicity, but that'd probably be
pushed to the baserel level. There'd need to be OR to keep it at the
join level, but the overall issue is the same, I think.

Also, this entirely ignores extended statistics - I have no idea how we
might tweak those in (3).

I understood the idea - it is very similar to what is implemented in the
current patch.

But I don't understand how to do it in the examine_variable function, to
be honest.
Well, I don't have a detailed plan either. In principle it shouldn't be
that hard, I think - examine_variable is loading the statistics, so it
could apply the same null_frac correction, just like nulltestsel would
do a bit later.
The main question is how to pass the information to examine_variable. It
doesn't get the SpecialJoinInfo (which is what nulltestsel used), so
we'd need to invent something new ... add a new argument?
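For what it's worth, here is a minimal sketch (not from any posted patch)
of the correction examine_variable could apply once it somehow receives
the unmatched fraction (say, via a new SpecialJoinInfo argument as
suggested above); the helper name is purely illustrative:

/*
 * Inflate the observed null fraction by the fraction of rows the outer
 * join is expected to leave unmatched, treating "column was NULL in the
 * base relation" and "row got NULL-extended because the join found no
 * match" as independent events: P(A or B) = P(A) + P(B) - P(A) * P(B).
 */
static double
adjust_null_frac(double stanullfrac, double unmatched_frac)
{
    double freq_null = stanullfrac + unmatched_frac -
        stanullfrac * unmatched_frac;

    /* clamp into [0, 1], like the planner's CLAMP_PROBABILITY macro */
    if (freq_null < 0.0)
        freq_null = 0.0;
    else if (freq_null > 1.0)
        freq_null = 1.0;

    return freq_null;
}

This is the same formula the experimental patch applies in nulltestsel().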
But (4) was suggesting we could improve this essentially by treating the
join as two distinct sets of rows

- the inner join result
- rows without match on the outer side

For the inner part, we would do estimates as now (using the regular
per-column statistics). If we knew the conditions match 100% rows, we'd
still get 100% when the conditions are combined.

For the second part of the join we know the outer side is just NULLs in
all columns, and that'd make the estimation much simpler for most
clauses. We'd just need to have "fake" statistics with null_frac=1.0 and
that's it.

And then we'd just combine these two selectivities. If we know the inner
side is 50% and all rows match the conditions, and no rows in the other
50% match, the selectivity is 50%.

inner_part * inner_sel + outer_part * outer_sel = 0.5 * 1.0 + 0.0 = 0.5

Now, we still have issues with independence assumption in each of these
parts separately. But that's OK, I think.

I think (4) could be implemented by doing the current estimation for the
inner part, and by tweaking examine_variable in the "outer" part in a
way similar to (3). Except that it just sets null_frac=1.0 everywhere.

For (4) we don't need to tweak those at all,
because for inner part we can just apply them as is, and for outer part
it's irrelevant because everything is NULL.

I like this idea the most) I'll try to start with this and implement the
patch.
Good to hear.
I hope this makes more sense. If not, let me know and I'll try to
explain it better.

Thank you for your explanation)
I will write back soon with the results or if I have any questions.
OK
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Well, I don't have a detailed plan either. In principle it shouldn't be
that hard, I think - examine_variable is loading the statistics, so it
could apply the same null_frac correction, just like nulltestsel would
do a bit later.

The main question is how to pass the information to examine_variable. It
doesn't get the SpecialJoinInfo (which is what nulltestsel used), so
we'd need to invent something new ... add a new argument?
Sorry I didn't answer right away. I was able to adapt the last version of
the patch [2] to the current idea, but so far I have implemented it only
for the situation where we save the fraction of NULL values in the
SpecialJoinInfo variable.
I'm starting to look at other functions such as scalararraysel_containment
and boolvarsel, and I'm trying to find bad cases for the current problem
that I could fix in a similar way in those functions. But I'm not sure
about the remaining functions
(mergejoinscansel, estimate_num_groups, estimate_hash_bucket_stats,
get_restriction_variable, get_join_variables).
The examine_variable function is also called in them, but since there is
no outer join involved there (no SpecialJoinInfo variable is available),
I can't think of similar cases where we'd have this problem caused by the
same reasons.
The code passes all regression tests and I found no deterioration in
cardinality prediction for queries from [1], except for one:
EXPLAIN ANALYZE
SELECT * FROM large FULL JOIN small ON (large.id = small.id)
WHERE (large.a IS NULL);
MASTER:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Merge Full Join (cost=127921.69..299941.59 rows=56503 width=16)
(actual time=795.092..795.094 rows=0 loops=1)
Merge Cond: (small.id = large.id)
Filter: (large.a IS NULL)
Rows Removed by Filter: 1000000
-> Sort (cost=158.51..164.16 rows=2260 width=8)
(actual time=0.038..0.046 rows=100 loops=1)
Sort Key: small.id
Sort Method: quicksort Memory: 29kB
-> Seq Scan on small (cost=0.00..32.60 rows=2260 width=8)
(actual time=0.013..0.022 rows=100 loops=1)
-> Materialize (cost=127763.19..132763.44 rows=1000050 width=8)
(actual time=363.016..649.103 rows=1000000 loops=1)
-> Sort (cost=127763.19..130263.31 rows=1000050 width=8)
(actual time=363.012..481.480 rows=1000000 loops=1)
Sort Key: large.id
Sort Method: external merge Disk: 17664kB
-> Seq Scan on large (cost=0.00..14425.50 rows=1000050 width=8)
(actual time=0.009..111.166 rows=1000000 loops=1)
Planning Time: 0.124 ms
Execution Time: 797.139 ms
(15 rows)
With patch:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Hash Full Join (cost=3.25..18179.25 rows=999900 width=16)
(actual time=261.480..261.482 rows=0 loops=1)
Hash Cond: (large.id = small.id)
Filter: (large.a IS NULL)
Rows Removed by Filter: 1000000
-> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
(actual time=0.006..92.827 rows=1000000 loops=1)
-> Hash (cost=2.00..2.00 rows=100 width=8)
(actual time=0.032..0.034 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on small (cost=0.00..2.00 rows=100 width=8)
(actual time=0.008..0.015 rows=100 loops=1)
Planning Time: 0.151 ms
Execution Time: 261.529 ms
(10 rows)
[1]: https://www.mail-archive.com/pgsql-hackers@lists.postgresql.org/msg146044.html
[2]: /messages/by-id/148ff8f1-067b-1409-c754-af6117de9b7d@yandex.ru
Unfortunately, I found that my previous change leads to a big error in
predicting selectivity.
I will investigate this case in more detail and try to find a solution
(I wrote the code and query below).
        // unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
        if (nd1 != nd2)
        {
            selec /= Max(nd1, nd2);
            *unmatched_frac = abs(nd1 - nd2) * 1.0 / Max(nd1, nd2);
        }
        else
        {
            selec /= nd2;
            *unmatched_frac = 0.0;
        }
postgres=# explain analyze select * from large l1 inner join large l2 on
l1.a is null where l1.a <100;
MASTER:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1000.00..35058.43 rows=1000000 width=16) (actual
time=91.846..93.622 rows=0 loops=1)
-> Gather (cost=1000.00..10633.43 rows=1 width=8) (actual
time=91.844..93.619 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on large l1 (cost=0.00..9633.33 rows=1
width=8) (actual time=86.153..86.154 rows=0 loops=3)
Filter: ((a IS NULL) AND (a < 100))
Rows Removed by Filter: 333333
-> Seq Scan on large l2 (cost=0.00..14425.00 rows=1000000 width=8)
(never executed)
Planning Time: 0.299 ms
Execution Time: 93.689 ms
(10 rows)
With patch:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1000.00..20863771.84 rows=1667083350 width=16)
(actual time=290.255..290.323 rows=0 loops=1)
-> Seq Scan on large l2 (cost=0.00..14425.50 rows=1000050 width=8)
(actual time=0.104..94.037 rows=1000000 loops=1)
-> Materialize (cost=1000.00..10808.63 rows=1667 width=8) (actual
time=0.000..0.000 rows=0 loops=1000000)
-> Gather (cost=1000.00..10800.29 rows=1667 width=8) (actual
time=79.472..79.539 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on large l1 (cost=0.00..9633.59
rows=695 width=8) (actual time=75.337..75.338 rows=0 loops=3)
Filter: ((a IS NULL) AND (a < 100))
Rows Removed by Filter: 333333
Planning Time: 0.721 ms
Execution Time: 290.425 ms
(11 rows)
As I remember, it could fix this one:
SELECT i AS id INTO l FROM generate_series(1,100000) i;
CREATE TABLE r (id int8, v text);
INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
ANALYZE l,r;
EXPLAIN ANALYZE
SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
ERROR: relation "l" already exists
ERROR: relation "r" already exists
INSERT 0 2
ANALYZE
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1.09..1944.13 rows=99998 width=14)
(actual time=0.152..84.184 rows=99999 loops=1)
Hash Cond: (l.id = r.id)
Filter: (r.v IS NULL)
Rows Removed by Filter: 2
-> Seq Scan on l (cost=0.00..1443.00 rows=100000 width=4)
(actual time=0.040..27.635 rows=100000 loops=1)
-> Hash (cost=1.04..1.04 rows=4 width=10)
(actual time=0.020..0.022 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on r (cost=0.00..1.04 rows=4 width=10)
(actual time=0.009..0.011 rows=4 loops=1)
Planning Time: 0.954 ms
Execution Time: 92.309 ms
(10 rows)
Do you think it's worth trying to apply the fourth method now?
As far as I remember, here we will face the problem of estimating
multiple conditions, and the 4th approach gives us a chance to avoid it.
I couldn't come up with such a case. I found only one that I also find
interesting, but so far I haven't been able to figure out well enough
what influenced the cardinality estimate and, accordingly, the choice of
a better plan.
CREATE TABLE large (id INT, a INT);
INSERT INTO large SELECT i, 1 FROM generate_series(1,50000) s(i);
CREATE TABLE small (id INT, b INT);
INSERT INTO small SELECT i, 1 FROM generate_series(1,25000) s(i);
INSERT INTO large SELECT i+50000, i+2 FROM generate_series(1,50000) s(i);
INSERT INTO small SELECT i+25000, i+2 FROM generate_series(1,25000) s(i);
explain analyze SELECT * FROM large LEFT JOIN small ON (large.id = small.id)
WHERE ((large.a=1) OR (small.b=1));
With patch:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1347.00..3911.91 rows=99775 width=16) (actual
time=36.864..82.634 rows=50000 loops=1)
Hash Cond: (large.id = small.id)
Filter: ((large.a = 1) OR (small.b = 1))
Rows Removed by Filter: 50000
-> Seq Scan on large (cost=0.00..1440.75 rows=99775 width=8)
(actual time=0.034..12.536 rows=100000 loops=1)
-> Hash (cost=722.00..722.00 rows=50000 width=8) (actual
time=36.752..36.754 rows=50000 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2466kB
-> Seq Scan on small (cost=0.00..722.00 rows=50000 width=8)
(actual time=0.028..13.337 rows=50000 loops=1)
Planning Time: 2.363 ms
Execution Time: 84.790 ms
(10 rows)
original:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Merge Right Join (cost=14400.45..516963.33 rows=250528 width=16)
(actual time=85.498..126.799 rows=50000 loops=1)
Merge Cond: (small.id = large.id)
Filter: ((large.a = 1) OR (small.b = 1))
Rows Removed by Filter: 50000
-> Sort (cost=4640.80..4766.23 rows=50172 width=8) (actual
time=41.538..44.204 rows=50000 loops=1)
Sort Key: small.id
Sort Method: quicksort Memory: 3710kB
-> Seq Scan on small (cost=0.00..723.72 rows=50172 width=8)
(actual time=0.068..15.182 rows=50000 loops=1)
-> Sort (cost=9759.65..10009.95 rows=100118 width=8) (actual
time=43.939..53.697 rows=100000 loops=1)
Sort Key: large.id
Sort Method: external sort Disk: 2160kB
-> Seq Scan on large (cost=0.00..1444.18 rows=100118
width=8) (actual time=0.046..10.378 rows=100000 loops=1)
Planning Time: 0.406 ms
Execution Time: 130.109 ms
(14 rows)
If you disable merge join with the original code, then the plans coincide
and the errors are roughly the same size; it's just that in one case the
cardinality is overestimated and in the other it is underestimated.
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1347.00..3915.00 rows=75082 width=16) (actual
time=43.625..68.901 rows=50000 loops=1)
Hash Cond: (large.id = small.id)
Filter: ((large.a = 1) OR (small.b = 1))
Rows Removed by Filter: 50000
-> Seq Scan on large (cost=0.00..1443.00 rows=100000 width=8)
(actual time=0.008..9.895 rows=100000 loops=1)
-> Hash (cost=722.00..722.00 rows=50000 width=8) (actual
time=22.546..22.548 rows=50000 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2466kB
-> Seq Scan on small (cost=0.00..722.00 rows=50000 width=8)
(actual time=0.006..7.218 rows=50000 loops=1)
Planning Time: 0.239 ms
Execution Time: 70.860 ms
(10 rows)
--
Regards,
Alena Rybakina
Postgres Professional
Attachments:
0001-Fix-calculating-the-underestimated-cardinality.patch (text/x-patch)
From 47daf6945403425f3a09a2df450a2cd2e96ae23c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Date: Mon, 10 Jul 2023 17:29:41 +0300
Subject: [PATCH] Fix the underestimated cardinality for LEFT/RIGHT/FULL
OUTER JOIN, where the NULL-extended rows produced by the join were not
taken into account. This patch fixes this by calculating the fraction of
rows on one side of the join without a matching row on the other side.
Co-authored-by: Alena Rybakina <a.rybakina@postgrespro.ru>
---
src/backend/utils/adt/array_selfuncs.c | 2 +-
src/backend/utils/adt/selfuncs.c | 314 ++++++++++++++-----------
src/include/nodes/pathnodes.h | 3 +
src/include/utils/selfuncs.h | 2 +-
4 files changed, 178 insertions(+), 143 deletions(-)
diff --git a/src/backend/utils/adt/array_selfuncs.c b/src/backend/utils/adt/array_selfuncs.c
index 9207a5ed193..56ce460cf05 100644
--- a/src/backend/utils/adt/array_selfuncs.c
+++ b/src/backend/utils/adt/array_selfuncs.c
@@ -93,7 +93,7 @@ scalararraysel_containment(PlannerInfo *root,
/*
* rightop must be a variable, else punt.
*/
- examine_variable(root, rightop, varRelid, &vardata);
+ examine_variable(root, rightop, varRelid, NULL, &vardata);
if (!vardata.rel)
{
ReleaseVariableStats(vardata);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c4fcd0076ea..072579dbfd5 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -153,7 +153,7 @@ static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
bool isdefault1, bool isdefault2,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
- bool have_mcvs1, bool have_mcvs2);
+ bool have_mcvs1, bool have_mcvs2, double *unmatched_frac);
static double eqjoinsel_semi(Oid opfuncoid, Oid collation,
VariableStatData *vardata1, VariableStatData *vardata2,
double nd1, double nd2,
@@ -1513,7 +1513,7 @@ boolvarsel(PlannerInfo *root, Node *arg, int varRelid)
VariableStatData vardata;
double selec;
- examine_variable(root, arg, varRelid, &vardata);
+ examine_variable(root, arg, varRelid, NULL, &vardata);
if (HeapTupleIsValid(vardata.statsTuple))
{
/*
@@ -1541,110 +1541,20 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
{
VariableStatData vardata;
double selec;
+ double freq_null;
+ AttStatsSlot sslot;
- examine_variable(root, arg, varRelid, &vardata);
+ examine_variable(root, arg, varRelid, sjinfo, &vardata);
- if (HeapTupleIsValid(vardata.statsTuple))
+ if (sjinfo)
+ {
+ freq_null = sjinfo->unmatched_frac;
+ }
+ else if(HeapTupleIsValid(vardata.statsTuple))
{
Form_pg_statistic stats;
- double freq_null;
- AttStatsSlot sslot;
-
stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
freq_null = stats->stanullfrac;
-
- if (get_attstatsslot(&sslot, vardata.statsTuple,
- STATISTIC_KIND_MCV, InvalidOid,
- ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS)
- && sslot.nnumbers > 0)
- {
- double freq_true;
- double freq_false;
-
- /*
- * Get first MCV frequency and derive frequency for true.
- */
- if (DatumGetBool(sslot.values[0]))
- freq_true = sslot.numbers[0];
- else
- freq_true = 1.0 - sslot.numbers[0] - freq_null;
-
- /*
- * Next derive frequency for false. Then use these as appropriate
- * to derive frequency for each case.
- */
- freq_false = 1.0 - freq_true - freq_null;
-
- switch (booltesttype)
- {
- case IS_UNKNOWN:
- /* select only NULL values */
- selec = freq_null;
- break;
- case IS_NOT_UNKNOWN:
- /* select non-NULL values */
- selec = 1.0 - freq_null;
- break;
- case IS_TRUE:
- /* select only TRUE values */
- selec = freq_true;
- break;
- case IS_NOT_TRUE:
- /* select non-TRUE values */
- selec = 1.0 - freq_true;
- break;
- case IS_FALSE:
- /* select only FALSE values */
- selec = freq_false;
- break;
- case IS_NOT_FALSE:
- /* select non-FALSE values */
- selec = 1.0 - freq_false;
- break;
- default:
- elog(ERROR, "unrecognized booltesttype: %d",
- (int) booltesttype);
- selec = 0.0; /* Keep compiler quiet */
- break;
- }
-
- free_attstatsslot(&sslot);
- }
- else
- {
- /*
- * No most-common-value info available. Still have null fraction
- * information, so use it for IS [NOT] UNKNOWN. Otherwise adjust
- * for null fraction and assume a 50-50 split of TRUE and FALSE.
- */
- switch (booltesttype)
- {
- case IS_UNKNOWN:
- /* select only NULL values */
- selec = freq_null;
- break;
- case IS_NOT_UNKNOWN:
- /* select non-NULL values */
- selec = 1.0 - freq_null;
- break;
- case IS_TRUE:
- case IS_FALSE:
- /* Assume we select half of the non-NULL values */
- selec = (1.0 - freq_null) / 2.0;
- break;
- case IS_NOT_TRUE:
- case IS_NOT_FALSE:
- /* Assume we select NULLs plus half of the non-NULLs */
- /* equiv. to freq_null + (1.0 - freq_null) / 2.0 */
- selec = (freq_null + 1.0) / 2.0;
- break;
- default:
- elog(ERROR, "unrecognized booltesttype: %d",
- (int) booltesttype);
- selec = 0.0; /* Keep compiler quiet */
- break;
- }
- }
}
else
{
@@ -1680,8 +1590,103 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
selec = 0.0; /* Keep compiler quiet */
break;
}
+ goto end;
+ }
+
+ if (vardata.statsTuple && get_attstatsslot(&sslot, vardata.statsTuple,
+ STATISTIC_KIND_MCV, InvalidOid,
+ ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS)
+ && sslot.nnumbers > 0)
+ {
+ double freq_true;
+ double freq_false;
+
+ /*
+ * Get first MCV frequency and derive frequency for true.
+ */
+ if (DatumGetBool(sslot.values[0]))
+ freq_true = sslot.numbers[0];
+ else
+ freq_true = 1.0 - sslot.numbers[0] - freq_null;
+
+ /*
+ * Next derive frequency for false. Then use these as appropriate
+ * to derive frequency for each case.
+ */
+ freq_false = 1.0 - freq_true - freq_null;
+
+ switch (booltesttype)
+ {
+ case IS_UNKNOWN:
+ /* select only NULL values */
+ selec = freq_null;
+ break;
+ case IS_NOT_UNKNOWN:
+ /* select non-NULL values */
+ selec = 1.0 - freq_null;
+ break;
+ case IS_TRUE:
+ /* select only TRUE values */
+ selec = freq_true;
+ break;
+ case IS_NOT_TRUE:
+ /* select non-TRUE values */
+ selec = 1.0 - freq_true;
+ break;
+ case IS_FALSE:
+ /* select only FALSE values */
+ selec = freq_false;
+ break;
+ case IS_NOT_FALSE:
+ /* select non-FALSE values */
+ selec = 1.0 - freq_false;
+ break;
+ default:
+ elog(ERROR, "unrecognized booltesttype: %d",
+ (int) booltesttype);
+ selec = 0.0; /* Keep compiler quiet */
+ break;
+ }
+
+ free_attstatsslot(&sslot);
+ }
+ else
+ {
+ /*
+ * No most-common-value info available. Still have null fraction
+ * information, so use it for IS [NOT] UNKNOWN. Otherwise adjust
+ * for null fraction and assume a 50-50 split of TRUE and FALSE.
+ */
+ switch (booltesttype)
+ {
+ case IS_UNKNOWN:
+ /* select only NULL values */
+ selec = freq_null;
+ break;
+ case IS_NOT_UNKNOWN:
+ /* select non-NULL values */
+ selec = 1.0 - freq_null;
+ break;
+ case IS_TRUE:
+ case IS_FALSE:
+ /* Assume we select half of the non-NULL values */
+ selec = (1.0 - freq_null) / 2.0;
+ break;
+ case IS_NOT_TRUE:
+ case IS_NOT_FALSE:
+ /* Assume we select NULLs plus half of the non-NULLs */
+ /* equiv. to freq_null + (1.0 - freq_null) / 2.0 */
+ selec = (freq_null + 1.0) / 2.0;
+ break;
+ default:
+ elog(ERROR, "unrecognized booltesttype: %d",
+ (int) booltesttype);
+ selec = 0.0; /* Keep compiler quiet */
+ break;
+ }
}
+ end:
ReleaseVariableStats(vardata);
/* result should be in range, but make sure... */
@@ -1699,39 +1704,19 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
{
VariableStatData vardata;
double selec;
+ double freq_null;
- examine_variable(root, arg, varRelid, &vardata);
+ examine_variable(root, arg, varRelid, sjinfo, &vardata);
- if (HeapTupleIsValid(vardata.statsTuple))
+ if (sjinfo)
+ {
+ freq_null = sjinfo->unmatched_frac;
+ }
+ else if (HeapTupleIsValid(vardata.statsTuple))
{
Form_pg_statistic stats;
- double freq_null;
-
stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
freq_null = stats->stanullfrac;
-
- switch (nulltesttype)
- {
- case IS_NULL:
-
- /*
- * Use freq_null directly.
- */
- selec = freq_null;
- break;
- case IS_NOT_NULL:
-
- /*
- * Select not unknown (not null) values. Calculate from
- * freq_null.
- */
- selec = 1.0 - freq_null;
- break;
- default:
- elog(ERROR, "unrecognized nulltesttype: %d",
- (int) nulltesttype);
- return (Selectivity) 0; /* keep compiler quiet */
- }
}
else if (vardata.var && IsA(vardata.var, Var) &&
((Var *) vardata.var)->varattno < 0)
@@ -1741,6 +1726,7 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
* NULL.
*/
selec = (nulltesttype == IS_NULL) ? 0.0 : 1.0;
+ goto end;
}
else
{
@@ -1760,8 +1746,33 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
(int) nulltesttype);
return (Selectivity) 0; /* keep compiler quiet */
}
+ goto end;
+ }
+
+ switch (nulltesttype)
+ {
+ case IS_NULL:
+
+ /*
+ * Use freq_null directly.
+ */
+ selec = freq_null;
+ break;
+ case IS_NOT_NULL:
+
+ /*
+ * Select not unknown (not null) values. Calculate from
+ * freq_null.
+ */
+ selec = 1.0 - freq_null;
+ break;
+ default:
+ elog(ERROR, "unrecognized nulltesttype: %d",
+ (int) nulltesttype);
+ return (Selectivity) 0; /* keep compiler quiet */
}
+ end:
ReleaseVariableStats(vardata);
/* result should be in range, but make sure... */
@@ -2319,7 +2330,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
isdefault1, isdefault2,
&sslot1, &sslot2,
stats1, stats2,
- have_mcvs1, have_mcvs2);
+ have_mcvs1, have_mcvs2, &sjinfo->unmatched_frac);
switch (sjinfo->jointype)
{
@@ -2407,7 +2418,8 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
bool isdefault1, bool isdefault2,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
- bool have_mcvs1, bool have_mcvs2)
+ bool have_mcvs1, bool have_mcvs2,
+ double *unmatched_frac)
{
double selec;
@@ -2503,6 +2515,9 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
}
CLAMP_PROBABILITY(matchfreq1);
CLAMP_PROBABILITY(unmatchfreq1);
+
+ *unmatched_frac = unmatchfreq1;
+
matchfreq2 = unmatchfreq2 = 0.0;
for (i = 0; i < sslot2->nvalues; i++)
{
@@ -2581,10 +2596,21 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
double nullfrac2 = stats2 ? stats2->stanullfrac : 0.0;
selec = (1.0 - nullfrac1) * (1.0 - nullfrac2);
- if (nd1 > nd2)
- selec /= nd1;
+ /*
+ * XXX Should this look at nullfrac on either side? Probably depends on
+ * if we're calculating fraction of NULLs or fraction of unmatched rows.
+ */
+ // unmatchfreq = (1.0 - nullfrac1) * (1.0 - nullfrac2);
+ if (nd1 != nd2)
+ {
+ selec /= Max(nd1, nd2);
+ *unmatched_frac = abs(nd1 - nd2) * 1.0 / Max(nd1, nd2);
+ }
else
+ {
selec /= nd2;
+ *unmatched_frac = 0.0;
+ }
}
return selec;
@@ -2964,8 +2990,8 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
return; /* shouldn't happen */
/* Look for stats for the inputs */
- examine_variable(root, left, 0, &leftvar);
- examine_variable(root, right, 0, &rightvar);
+ examine_variable(root, left, 0, NULL, &leftvar);
+ examine_variable(root, right, 0, NULL, &rightvar);
/* Extract the operator's declared left/right datatypes */
get_op_opfamily_properties(opno, opfamily, false,
@@ -3469,7 +3495,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
* independent, than to combine estimates for the extracted variables
* when we don't know how that relates to the expressions.
*/
- examine_variable(root, groupexpr, 0, &vardata);
+ examine_variable(root, groupexpr, 0, NULL, &vardata);
if (HeapTupleIsValid(vardata.statsTuple) || vardata.isunique)
{
varinfos = add_unique_group_var(root, varinfos,
@@ -3510,7 +3536,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
{
Node *var = (Node *) lfirst(l2);
- examine_variable(root, var, 0, &vardata);
+ examine_variable(root, var, 0, NULL, &vardata);
varinfos = add_unique_group_var(root, varinfos, var, &vardata);
ReleaseVariableStats(vardata);
}
@@ -3777,7 +3803,7 @@ estimate_hash_bucket_stats(PlannerInfo *root, Node *hashkey, double nbuckets,
bool isdefault;
AttStatsSlot sslot;
- examine_variable(root, hashkey, 0, &vardata);
+ examine_variable(root, hashkey, 0, NULL, &vardata);
/* Look up the frequency of the most common value, if available */
*mcv_freq = 0.0;
@@ -4865,8 +4891,8 @@ get_restriction_variable(PlannerInfo *root, List *args, int varRelid,
* Examine both sides. Note that when varRelid is nonzero, Vars of other
* relations will be treated as pseudoconstants.
*/
- examine_variable(root, left, varRelid, vardata);
- examine_variable(root, right, varRelid, &rdata);
+ examine_variable(root, left, varRelid, NULL, vardata);
+ examine_variable(root, right, varRelid, NULL, &rdata);
/*
* If one side is a variable and the other not, we win.
@@ -4919,8 +4945,8 @@ get_join_variables(PlannerInfo *root, List *args, SpecialJoinInfo *sjinfo,
left = (Node *) linitial(args);
right = (Node *) lsecond(args);
- examine_variable(root, left, 0, vardata1);
- examine_variable(root, right, 0, vardata2);
+ examine_variable(root, left, 0, NULL, vardata1);
+ examine_variable(root, right, 0, NULL, vardata2);
if (vardata1->rel &&
bms_is_subset(vardata1->rel->relids, sjinfo->syn_righthand))
@@ -4975,12 +5001,13 @@ ReleaseDummy(HeapTuple tuple)
* Caller is responsible for doing ReleaseVariableStats() before exiting.
*/
void
-examine_variable(PlannerInfo *root, Node *node, int varRelid,
+examine_variable(PlannerInfo *root, Node *node, int varRelid, SpecialJoinInfo *sjinfo,
VariableStatData *vardata)
{
Node *basenode;
Relids varnos;
RelOptInfo *onerel;
+ Form_pg_statistic stats;
/* Make sure we don't return dangling pointers in vardata */
MemSet(vardata, 0, sizeof(VariableStatData));
@@ -5240,6 +5267,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
break;
}
+ if (HeapTupleIsValid(vardata->statsTuple) && sjinfo)
+ {
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ sjinfo->unmatched_frac = stats->stanullfrac + sjinfo->unmatched_frac - stats->stanullfrac * sjinfo->unmatched_frac;
+ }
/*
* Search extended statistics for one with a matching expression.
* There might be multiple ones, so just grab the first one. In the
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c17b53f7adb..6bc63e648e6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2853,6 +2853,9 @@ struct SpecialJoinInfo
bool semi_can_hash; /* true if semi_operators are all hash */
List *semi_operators; /* OIDs of equality join operators */
List *semi_rhs_exprs; /* righthand-side expressions of these ops */
+
+ /* For outer join, fraction of rows without a match. */
+ Selectivity unmatched_frac;
};
/*
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 2f76c473db4..d9418f7541c 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -148,7 +148,7 @@ extern PGDLLIMPORT get_index_stats_hook_type get_index_stats_hook;
/* Functions in selfuncs.c */
extern void examine_variable(PlannerInfo *root, Node *node, int varRelid,
- VariableStatData *vardata);
+ SpecialJoinInfo *sjinfo, VariableStatData *vardata);
extern bool statistic_proc_security_check(VariableStatData *vardata, Oid func_oid);
extern bool get_restriction_variable(PlannerInfo *root, List *args,
int varRelid,
--
2.34.1
Hi,
I'm still working on it, but unfortunately I haven't had enough time to
get it into a shape that would be worth showing.
Now I am trying to sort out the problems that I pointed out in the
previous letter.
--
Regards,
Alena Rybakina
Postgres Professional
On 6/24/23 13:23, Tomas Vondra wrote:
I really hope what I just wrote makes at least a little bit of sense.
Here is a continuation of your work:
1. The non-matched estimation is made more sophisticated, to provide
meaningful numbers.
2. unmatched_frac is stored in sjinfo, which lets us accumulate these
generated NULLs along the tree of outer joins.
3. unmatched_frac is used not only for nulltest estimation but also for
correcting the stanullfrac and stadistinct values (improves GROUP BY).
4. EXPLAIN prints the number of estimated and actually generated NULLs
in outer-join nodes.
This patch is no less ugly than yours, maybe even more so. Also, I'm not
sure it became more meaningful. But it helps when applied to some
reported cases I have ;).
--
regards, Andrei Lepikhov
Attachments:
0001-Consider-number-of-nulls-generated-by-inner-side-of-.patch (text/x-patch)
From 938c311d76404a22d96f3fcff406fe6c7e2f59f7 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepihov@gmail.com>
Date: Thu, 17 Apr 2025 14:33:02 +0200
Subject: [PATCH] Consider number of nulls generated by inner side of an
outer-join.
The nullable side of an outer join may produce a significant fraction of
NULL values in place of real values. Having no information about that in
the statistics, the planner may accumulate estimation error, because base
parameters like stanullfrac and stadistinct no longer reflect reality.
In this patch we employ the varnullingrels field in an attempt to account
for the generated NULLs during query planning.
---
contrib/intarray/_int_selfuncs.c | 5 +-
src/backend/commands/explain.c | 24 ++
src/backend/executor/nodeHashjoin.c | 3 +
src/backend/executor/nodeMergejoin.c | 2 +
src/backend/executor/nodeNestloop.c | 1 +
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/createplan.c | 2 +
src/backend/optimizer/plan/initsplan.c | 1 +
src/backend/optimizer/util/pathnode.c | 9 +
src/backend/utils/adt/like_support.c | 5 +-
.../utils/adt/multirangetypes_selfuncs.c | 4 +-
src/backend/utils/adt/network_selfuncs.c | 18 +-
src/backend/utils/adt/rangetypes_selfuncs.c | 4 +-
src/backend/utils/adt/selfuncs.c | 325 +++++++++++++++---
src/backend/utils/misc/guc_tables.c | 10 +
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 9 +
src/include/nodes/plannodes.h | 2 +
src/include/optimizer/cost.h | 1 +
src/include/utils/selfuncs.h | 2 +
src/test/regress/expected/stats.out | 24 ++
src/test/regress/sql/stats.sql | 18 +
22 files changed, 402 insertions(+), 69 deletions(-)
diff --git a/contrib/intarray/_int_selfuncs.c b/contrib/intarray/_int_selfuncs.c
index 6c3b7ace146..c8d8f4e42f4 100644
--- a/contrib/intarray/_int_selfuncs.c
+++ b/contrib/intarray/_int_selfuncs.c
@@ -188,10 +188,7 @@ _int_matchsel(PG_FUNCTION_ARGS)
*/
if (HeapTupleIsValid(vardata.statsTuple))
{
- Form_pg_statistic stats;
-
- stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
- nullfrac = stats->stanullfrac;
+ nullfrac = compute_stanullfrac(&vardata, NULL);
/*
* For an int4 array, the default array type analyze function will
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 786ee865f14..d4ddaa2db32 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2273,6 +2273,30 @@ ExplainNode(PlanState *planstate, List *ancestors,
break;
}
+ if (nodeTag(plan) == T_NestLoop || nodeTag(plan) == T_MergeJoin ||
+ nodeTag(plan) == T_HashJoin)
+ {
+ JoinState *js = (JoinState *) planstate;
+ Join *join = (Join *) plan;
+
+ if (join->unmatched_frac >= 0.0)
+ {
+ /*
+ * There are a probability of null tuples existed. Show how
+ * precisely the planner have done its job.
+ * XXX: The type of output value is debatable. Would it
+ * be better to show relative values? For now, current format just
+ * simpler.
+ */
+ ExplainPropertyFloat("Planned unmatched rows", NULL,
+ join->unmatched_frac * join->plan.plan_rows, 0, es);
+
+ if (es->analyze)
+ ExplainPropertyFloat("Actual unmatched rows", NULL,
+ js->unmatched_tuples, 0, es);
+ }
+ }
+
/*
* Prepare per-worker JIT instrumentation. As with the overall JIT
* summary, this is printed only if printing costs is enabled.
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 5661ad76830..a7be7a9f2b8 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -615,7 +615,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
econtext->ecxt_innertuple = node->hj_NullInnerTupleSlot;
if (otherqual == NULL || ExecQual(otherqual, econtext))
+ {
+ node->js.unmatched_tuples++;
return ExecProject(node->js.ps.ps_ProjInfo);
+ }
else
InstrCountFiltered2(node, 1);
}
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index a233313128a..69c0a4b2396 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -668,6 +668,7 @@ ExecMergeJoin(PlanState *pstate)
*/
TupleTableSlot *result;
+ node->js.unmatched_tuples++;
result = MJFillOuter(node);
if (result)
return result;
@@ -723,6 +724,7 @@ ExecMergeJoin(PlanState *pstate)
*/
TupleTableSlot *result;
+ node->js.unmatched_tuples++;
result = MJFillInner(node);
if (result)
return result;
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 5cd1a251625..26e9537d187 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -188,6 +188,7 @@ ExecNestLoop(PlanState *pstate)
*/
ENL1_printf("qualification succeeded, projecting tuple");
+ node->js.unmatched_tuples++;
return ExecProject(node->js.ps.ps_ProjInfo);
}
else
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 60b0fcfb6be..b877de37134 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -163,6 +163,7 @@ bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
bool enable_presorted_aggregate = true;
bool enable_async_append = true;
+bool compute_innerside_nulls = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index a8f22a8c154..a6ca458e22b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1104,6 +1104,8 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path)
break;
}
+ ((Join *) plan)->unmatched_frac = best_path->unmatch_frac;
+
/*
* If there are any pseudoconstant clauses attached to this node, insert a
* gating Result node that evaluates the pseudoconstants as one-time
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 01804b085b3..25e30786605 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -1765,6 +1765,7 @@ make_outerjoininfo(PlannerInfo *root,
sjinfo->commute_above_r = NULL;
sjinfo->commute_below_l = NULL;
sjinfo->commute_below_r = NULL;
+ sjinfo->unmatched_frac = -1; /* Undefined */
compute_semijoin_info(root, sjinfo, clause);
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 93e73cb44db..0d6ec5586be 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2605,6 +2605,9 @@ create_nestloop_path(PlannerInfo *root,
final_cost_nestloop(root, pathnode, workspace, extra);
+ pathnode->jpath.unmatch_frac = extra->sjinfo->ojrelid > 0 ?
+ extra->sjinfo->unmatched_frac : -1.0;
+
return pathnode;
}
@@ -2674,6 +2677,9 @@ create_mergejoin_path(PlannerInfo *root,
final_cost_mergejoin(root, pathnode, workspace, extra);
+ pathnode->jpath.unmatch_frac = extra->sjinfo->ojrelid > 0 ?
+ extra->sjinfo->unmatched_frac : -1.0;
+
return pathnode;
}
@@ -2748,6 +2754,9 @@ create_hashjoin_path(PlannerInfo *root,
final_cost_hashjoin(root, pathnode, workspace, extra);
+ pathnode->jpath.unmatch_frac = extra->sjinfo->ojrelid > 0 ?
+ extra->sjinfo->unmatched_frac : -1.0;
+
return pathnode;
}
diff --git a/src/backend/utils/adt/like_support.c b/src/backend/utils/adt/like_support.c
index 8fdc677371f..e91874eff31 100644
--- a/src/backend/utils/adt/like_support.c
+++ b/src/backend/utils/adt/like_support.c
@@ -603,10 +603,7 @@ patternsel_common(PlannerInfo *root,
*/
if (HeapTupleIsValid(vardata.statsTuple))
{
- Form_pg_statistic stats;
-
- stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
- nullfrac = stats->stanullfrac;
+ nullfrac = compute_stanullfrac(&vardata, NULL);
}
/*
diff --git a/src/backend/utils/adt/multirangetypes_selfuncs.c b/src/backend/utils/adt/multirangetypes_selfuncs.c
index b87bcf3ea30..24a1fa3800a 100644
--- a/src/backend/utils/adt/multirangetypes_selfuncs.c
+++ b/src/backend/utils/adt/multirangetypes_selfuncs.c
@@ -302,11 +302,9 @@ calc_multirangesel(TypeCacheEntry *typcache, VariableStatData *vardata,
*/
if (HeapTupleIsValid(vardata->statsTuple))
{
- Form_pg_statistic stats;
AttStatsSlot sslot;
- stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
- null_frac = stats->stanullfrac;
+ null_frac = compute_stanullfrac(vardata, NULL);
/* Try to get fraction of empty multiranges */
if (get_attstatsslot(&sslot, vardata->statsTuple,
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index 940cdafa546..05332d378aa 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -89,7 +89,6 @@ networksel(PG_FUNCTION_ARGS)
mcv_selec,
non_mcv_selec;
Datum constvalue;
- Form_pg_statistic stats;
AttStatsSlot hslot;
double sumcommon,
nullfrac;
@@ -127,8 +126,7 @@ networksel(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
}
- stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
- nullfrac = stats->stanullfrac;
+ nullfrac = compute_stanullfrac(&vardata, NULL);
/*
* If we have most-common-values info, add up the fractions of the MCV
@@ -263,7 +261,6 @@ static Selectivity
networkjoinsel_inner(Oid operator,
VariableStatData *vardata1, VariableStatData *vardata2)
{
- Form_pg_statistic stats;
double nullfrac1 = 0.0,
nullfrac2 = 0.0;
Selectivity selec = 0.0,
@@ -283,8 +280,7 @@ networkjoinsel_inner(Oid operator,
if (HeapTupleIsValid(vardata1->statsTuple))
{
- stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple);
- nullfrac1 = stats->stanullfrac;
+ nullfrac1 = compute_stanullfrac(vardata1, NULL);
mcv1_exists = get_attstatsslot(&mcv1_slot, vardata1->statsTuple,
STATISTIC_KIND_MCV, InvalidOid,
@@ -305,8 +301,7 @@ networkjoinsel_inner(Oid operator,
if (HeapTupleIsValid(vardata2->statsTuple))
{
- stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple);
- nullfrac2 = stats->stanullfrac;
+ nullfrac2 = compute_stanullfrac(vardata2, NULL);
mcv2_exists = get_attstatsslot(&mcv2_slot, vardata2->statsTuple,
STATISTIC_KIND_MCV, InvalidOid,
@@ -390,7 +385,6 @@ static Selectivity
networkjoinsel_semi(Oid operator,
VariableStatData *vardata1, VariableStatData *vardata2)
{
- Form_pg_statistic stats;
Selectivity selec = 0.0,
sumcommon1 = 0.0,
sumcommon2 = 0.0;
@@ -413,8 +407,7 @@ networkjoinsel_semi(Oid operator,
if (HeapTupleIsValid(vardata1->statsTuple))
{
- stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple);
- nullfrac1 = stats->stanullfrac;
+ nullfrac1 = compute_stanullfrac(vardata1, NULL);
mcv1_exists = get_attstatsslot(&mcv1_slot, vardata1->statsTuple,
STATISTIC_KIND_MCV, InvalidOid,
@@ -435,8 +428,7 @@ networkjoinsel_semi(Oid operator,
if (HeapTupleIsValid(vardata2->statsTuple))
{
- stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple);
- nullfrac2 = stats->stanullfrac;
+ nullfrac2 = compute_stanullfrac(vardata2, NULL);
mcv2_exists = get_attstatsslot(&mcv2_slot, vardata2->statsTuple,
STATISTIC_KIND_MCV, InvalidOid,
diff --git a/src/backend/utils/adt/rangetypes_selfuncs.c b/src/backend/utils/adt/rangetypes_selfuncs.c
index d126abc5a82..0bfc68baac6 100644
--- a/src/backend/utils/adt/rangetypes_selfuncs.c
+++ b/src/backend/utils/adt/rangetypes_selfuncs.c
@@ -241,11 +241,9 @@ calc_rangesel(TypeCacheEntry *typcache, VariableStatData *vardata,
*/
if (HeapTupleIsValid(vardata->statsTuple))
{
- Form_pg_statistic stats;
AttStatsSlot sslot;
- stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
- null_frac = stats->stanullfrac;
+ null_frac = compute_stanullfrac(vardata, NULL);
/* Try to get fraction of empty ranges */
if (get_attstatsslot(&sslot, vardata->statsTuple,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 987f2154459..5f3e26e5235 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -156,6 +156,13 @@ static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
AttStatsSlot *sslot1, AttStatsSlot *sslot2,
Form_pg_statistic stats1, Form_pg_statistic stats2,
bool have_mcvs1, bool have_mcvs2);
+static double eqjoinsel_unmatch_left(Oid opfuncoid, Oid collation,
+ VariableStatData *vardata1, VariableStatData *vardata2,
+ double nd1, double nd2,
+ bool isdefault1, bool isdefault2,
+ AttStatsSlot *sslot1, AttStatsSlot *sslot2,
+ Form_pg_statistic stats1, Form_pg_statistic stats2,
+ bool have_mcvs1, bool have_mcvs2);
static double eqjoinsel_semi(Oid opfuncoid, Oid collation,
VariableStatData *vardata1, VariableStatData *vardata2,
double nd1, double nd2,
@@ -319,10 +326,7 @@ var_eq_const(VariableStatData *vardata, Oid oproid, Oid collation,
*/
if (HeapTupleIsValid(vardata->statsTuple))
{
- Form_pg_statistic stats;
-
- stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
- nullfrac = stats->stanullfrac;
+ nullfrac = compute_stanullfrac(vardata, NULL);
}
/*
@@ -481,10 +485,7 @@ var_eq_non_const(VariableStatData *vardata, Oid oproid, Oid collation,
*/
if (HeapTupleIsValid(vardata->statsTuple))
{
- Form_pg_statistic stats;
-
- stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
- nullfrac = stats->stanullfrac;
+ nullfrac = compute_stanullfrac(vardata, NULL);
}
/*
@@ -997,10 +998,7 @@ generic_restriction_selectivity(PlannerInfo *root, Oid oproid, Oid collation,
selec = 0.9999;
/* Don't forget to account for nulls. */
- if (HeapTupleIsValid(vardata.statsTuple))
- nullfrac = ((Form_pg_statistic) GETSTRUCT(vardata.statsTuple))->stanullfrac;
- else
- nullfrac = 0.0;
+ nullfrac = compute_stanullfrac(&vardata, NULL);
/*
* Now merge the results from the MCV and histogram calculations,
@@ -1552,12 +1550,10 @@ booltestsel(PlannerInfo *root, BoolTestType booltesttype, Node *arg,
if (HeapTupleIsValid(vardata.statsTuple))
{
- Form_pg_statistic stats;
double freq_null;
AttStatsSlot sslot;
- stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
- freq_null = stats->stanullfrac;
+ freq_null = compute_stanullfrac(&vardata, NULL);
if (get_attstatsslot(&sslot, vardata.statsTuple,
STATISTIC_KIND_MCV, InvalidOid,
@@ -1710,11 +1706,12 @@ nulltestsel(PlannerInfo *root, NullTestType nulltesttype, Node *arg,
if (HeapTupleIsValid(vardata.statsTuple))
{
- Form_pg_statistic stats;
double freq_null;
- stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
- freq_null = stats->stanullfrac;
+ freq_null = compute_stanullfrac(&vardata, NULL);
+
+ if (sjinfo)
+ freq_null = freq_null + sjinfo->unmatched_frac - freq_null * sjinfo->unmatched_frac;
switch (nulltesttype)
{
@@ -2287,6 +2284,7 @@ eqjoinsel(PG_FUNCTION_ARGS)
Oid collation = PG_GET_COLLATION();
double selec;
double selec_inner;
+ double unmatched_frac;
VariableStatData vardata1;
VariableStatData vardata2;
double nd1;
@@ -2359,6 +2357,26 @@ eqjoinsel(PG_FUNCTION_ARGS)
stats1, stats2,
have_mcvs1, have_mcvs2);
+ /*
+ * calculate fraction of right without of matching row on left
+ *
+ * FIXME Should be restricted to JOIN_LEFT, we should have similar logic
+ * for JOIN_FULL.
+ *
+ * XXX Probably should calculate unmatched as fraction of the join result,
+ * not of the relation on the right (because the matched part can have more
+ * matches per row and thus grow). Not sure. Depends on how it's used later.
+ */
+ unmatched_frac = eqjoinsel_unmatch_left(opfuncoid, collation,
+ &vardata1, &vardata2,
+ nd1, nd2,
+ isdefault1, isdefault2,
+ &sslot1, &sslot2,
+ stats1, stats2,
+ have_mcvs1, have_mcvs2);
+
+ sjinfo->unmatched_frac = unmatched_frac;
+
switch (sjinfo->jointype)
{
case JOIN_INNER:
@@ -2467,8 +2485,8 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
FmgrInfo eqproc;
bool *hasmatch1;
bool *hasmatch2;
- double nullfrac1 = stats1->stanullfrac;
- double nullfrac2 = stats2->stanullfrac;
+ double nullfrac1 = compute_stanullfrac(vardata1, NULL);
+ double nullfrac2 = compute_stanullfrac(vardata2, NULL);
double matchprodfreq,
matchfreq1,
matchfreq2,
@@ -2628,6 +2646,135 @@ eqjoinsel_inner(Oid opfuncoid, Oid collation,
return selec;
}
+/*
+ * eqjoinsel_unmatch_left
+ *
+ * XXX Mostly a copy-paste of eqjoinsel_inner, but it calculates the unmatched
+ * fraction of the relation on the left. See eqjoinsel_inner for more comments.
+ *
+ * XXX Maybe we could/should integrate this into eqjoinsel_inner, so that we
+ * don't have to walk the MCV lists twice.
+ */
+static double
+eqjoinsel_unmatch_left(Oid opfuncoid, Oid collation,
+ VariableStatData *vardata1, VariableStatData *vardata2,
+ double nd1, double nd2,
+ bool isdefault1, bool isdefault2,
+ AttStatsSlot *sslot1, AttStatsSlot *sslot2,
+ Form_pg_statistic stats1, Form_pg_statistic stats2,
+ bool have_mcvs1, bool have_mcvs2)
+{
+ double unmatchfreq;
+ double nullfrac1 = compute_stanullfrac(vardata1, NULL);
+
+ if (!compute_innerside_nulls)
+ return 0.0;
+
+ if (have_mcvs1 && have_mcvs2)
+ {
+ LOCAL_FCINFO(fcinfo, 2);
+ FmgrInfo eqproc;
+ bool *hasmatch1;
+ bool *hasmatch2;
+ double matchfreq1,
+ unmatchfreq1;
+ int i,
+ nmatches;
+
+ fmgr_info(opfuncoid, &eqproc);
+
+ /*
+ * Save a few cycles by setting up the fcinfo struct just once. Using
+ * FunctionCallInvoke directly also avoids failure if the eqproc
+ * returns NULL, though really equality functions should never do
+ * that.
+ */
+ InitFunctionCallInfoData(*fcinfo, &eqproc, 2, collation,
+ NULL, NULL);
+ fcinfo->args[0].isnull = false;
+ fcinfo->args[1].isnull = false;
+
+ hasmatch1 = (bool *) palloc0(sslot1->nvalues * sizeof(bool));
+ hasmatch2 = (bool *) palloc0(sslot2->nvalues * sizeof(bool));
+
+ /*
+ * Note we assume that each MCV will match at most one member of the
+ * other MCV list. If the operator isn't really equality, there could
+ * be multiple matches --- but we don't look for them, both for speed
+ * and because the math wouldn't add up...
+ */
+ nmatches = 0;
+ for (i = 0; i < sslot1->nvalues; i++)
+ {
+ int j;
+
+ fcinfo->args[0].value = sslot1->values[i];
+
+ for (j = 0; j < sslot2->nvalues; j++)
+ {
+ Datum fresult;
+
+ if (hasmatch2[j])
/* Already matched to an earlier value; it cannot match another */
+ continue;
+
+ fcinfo->args[1].value = sslot2->values[j];
+ fcinfo->isnull = false;
+ fresult = FunctionCallInvoke(fcinfo);
+ if (!fcinfo->isnull && DatumGetBool(fresult))
+ {
+ hasmatch1[i] = hasmatch2[j] = true;
+ nmatches++;
+ break;
+ }
+ }
+ }
+
+ /* Sum up frequencies of matched and unmatched MCVs */
+ matchfreq1 = unmatchfreq1 = 0.0;
+ for (i = 0; i < sslot1->nvalues; i++)
+ {
+ if (hasmatch1[i])
+ matchfreq1 += sslot1->numbers[i];
+ else
+ unmatchfreq1 += sslot1->numbers[i];
+ }
+ CLAMP_PROBABILITY(matchfreq1);
+ CLAMP_PROBABILITY(unmatchfreq1);
+ pfree(hasmatch1);
+ pfree(hasmatch2);
+
+ unmatchfreq = 1.0 - nullfrac1 - matchfreq1;
+ CLAMP_PROBABILITY(unmatchfreq);
+
+ nd1 -= nmatches;
+ nd2 -= nmatches;
+
+ /* XXX Keep an eye on cases where ndistinct becomes negative */
+ nd1 = clamp_row_est(nd1);
+ nd2 = clamp_row_est(nd2);
+
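+ /*
+ * Scale by the fraction of distinct values on the side with more
+ * distincts that cannot have a matching value on the other side.
+ */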
+ if (nd1 > nd2)
+ unmatchfreq *= (nd1 - nd2) / nd1;
+ else
+ unmatchfreq *= (nd2 - nd1) / nd2;
+ }
+ else
+ {
+ /*
+ * XXX Should this look at nullfrac on either side? That probably depends
+ * on whether we're calculating the fraction of NULLs or the fraction of
+ * unmatched rows.
+ */
+ unmatchfreq = (1.0 - nullfrac1);
+ if (nd1 > nd2)
+ unmatchfreq *= (nd1 - nd2) / nd1;
+ else
+ unmatchfreq *= (nd2 - nd1) / nd2;
+ }
+
+ return unmatchfreq;
+}
+
/*
* eqjoinsel_semi --- eqjoinsel for semi join
*
@@ -2694,7 +2841,7 @@ eqjoinsel_semi(Oid opfuncoid, Oid collation,
FmgrInfo eqproc;
bool *hasmatch1;
bool *hasmatch2;
- double nullfrac1 = stats1->stanullfrac;
+ double nullfrac1 = compute_stanullfrac(vardata1, NULL);
double matchfreq1,
uncertainfrac,
uncertain;
@@ -2854,15 +3001,11 @@ neqjoinsel(PG_FUNCTION_ARGS)
VariableStatData leftvar;
VariableStatData rightvar;
bool reversed;
- HeapTuple statsTuple;
double nullfrac;
get_join_variables(root, args, sjinfo, &leftvar, &rightvar, &reversed);
- statsTuple = reversed ? rightvar.statsTuple : leftvar.statsTuple;
- if (HeapTupleIsValid(statsTuple))
- nullfrac = ((Form_pg_statistic) GETSTRUCT(statsTuple))->stanullfrac;
- else
- nullfrac = 0.0;
+ nullfrac = compute_stanullfrac(reversed ? &rightvar : &leftvar, NULL);
+
ReleaseVariableStats(leftvar);
ReleaseVariableStats(rightvar);
@@ -3226,22 +3369,18 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
*/
if (nulls_first)
{
- Form_pg_statistic stats;
-
if (HeapTupleIsValid(leftvar.statsTuple))
{
- stats = (Form_pg_statistic) GETSTRUCT(leftvar.statsTuple);
- *leftstart += stats->stanullfrac;
+ *leftstart += compute_stanullfrac(&leftvar, NULL);
CLAMP_PROBABILITY(*leftstart);
- *leftend += stats->stanullfrac;
+ *leftend += compute_stanullfrac(&leftvar, NULL);
CLAMP_PROBABILITY(*leftend);
}
if (HeapTupleIsValid(rightvar.statsTuple))
{
- stats = (Form_pg_statistic) GETSTRUCT(rightvar.statsTuple);
- *rightstart += stats->stanullfrac;
+ *rightstart += compute_stanullfrac(&rightvar, NULL);
CLAMP_PROBABILITY(*rightstart);
- *rightend += stats->stanullfrac;
+ *rightend += compute_stanullfrac(&rightvar, NULL);
CLAMP_PROBABILITY(*rightend);
}
}
@@ -4049,10 +4188,7 @@ estimate_hash_bucket_stats(PlannerInfo *root, Node *hashkey, double nbuckets,
/* Get fraction that are null */
if (HeapTupleIsValid(vardata.statsTuple))
{
- Form_pg_statistic stats;
-
- stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
- stanullfrac = stats->stanullfrac;
+ stanullfrac = compute_stanullfrac(&vardata, NULL);
}
else
stanullfrac = 0.0;
@@ -5196,6 +5332,109 @@ ReleaseDummy(HeapTuple tuple)
pfree(tuple);
}
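+
+/*
+ * Comparator for list_sort(): order SpecialJoinInfo entries by the number
+ * of relations in min_lefthand plus min_righthand, so that smaller outer
+ * joins come first.
+ */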
+static int
+sjinfo_cmp(const ListCell *lc1, const ListCell *lc2)
+{
+ SpecialJoinInfo *sjinfo1 = lfirst_node(SpecialJoinInfo, lc1);
+ SpecialJoinInfo *sjinfo2 = lfirst_node(SpecialJoinInfo, lc2);
+ int num1 = bms_num_members(
+ bms_union(sjinfo1->min_lefthand, sjinfo1->min_righthand));
+ int num2 = bms_num_members(
+ bms_union(sjinfo2->min_lefthand, sjinfo2->min_righthand));
+
+ if (num1 > num2)
+ return 1;
+ else if (num1 == num2)
+ return 0;
+ else
+ return -1;
+}
+
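+/*
+ * compute_stanullfrac
+ * Estimate the fraction of NULL values for a variable.
+ *
+ * Start from the stanullfrac recorded in pg_statistic and, when
+ * compute_innerside_nulls is enabled, add NULLs generated by outer joins
+ * that can null this variable, using the unmatched_frac recorded in the
+ * relevant SpecialJoinInfo.
+ *
+ * If gen_frac is not NULL, it receives an estimate of the fraction of rows
+ * whose NULLs come from unmatched outer-join rows rather than from the
+ * column itself.
+ */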
+double
+compute_stanullfrac(VariableStatData *vardata, double *gen_frac)
+{
+ Form_pg_statistic stats;
+ Var *var;
+ PlannerInfo *root = vardata->root;
+ List *sjinfos = NIL;
+ double nullfrac;
+
+ if (gen_frac != NULL)
+ *gen_frac = 0.0;
+
+ if (!HeapTupleIsValid(vardata->statsTuple) || vardata->var == NULL)
+ return 0.0;
+
+ if (IsA(vardata->var, RelabelType))
+ var = (Var *) (((RelabelType *) vardata->var)->arg);
+ else
+ var = (Var *) vardata->var;
+
+ stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
+ nullfrac = stats->stanullfrac;
+
+ if (!compute_innerside_nulls)
+ return nullfrac;
+
+ if (root == NULL || !IsA(var, Var) ||
+ bms_is_empty(var->varnullingrels))
+ return nullfrac;
+
+ /* Find all outer joins that can null this column */
+ foreach_node(SpecialJoinInfo, sjinfo, vardata->root->join_info_list)
+ {
+ RangeTblEntry *rte;
+
+ if (!OidIsValid(sjinfo->ojrelid) ||
+ !bms_is_member(sjinfo->ojrelid, var->varnullingrels))
+ continue;
+
+ Assert(OidIsValid(sjinfo->ojrelid));
+
+ /*
+ * varnullingrels may reference things other than outer joins, so we
+ * need to filter out everything else first.
+ */
+
+ rte = root->simple_rte_array[sjinfo->ojrelid];
+
+ if (rte->rtekind != RTE_JOIN)
+ continue;
+
+ Assert(sjinfo->ojrelid < root->simple_rel_array_size &&
+ root->simple_rel_array[sjinfo->ojrelid] == NULL);
+ Assert(bms_is_member(sjinfo->ojrelid, root->outer_join_rels));
+
+ sjinfos = lappend(sjinfos, sjinfo);
+ break;
+ }
+ list_sort(sjinfos, sjinfo_cmp);
+
+ foreach_node(SpecialJoinInfo, sjinfo, sjinfos)
+ {
+ double unmatched_frac;
+
+ if (sjinfo->unmatched_frac <= 0.0)
+ continue;
+
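+ /*
+ * Fold the join's unmatched fraction into the null fraction, keeping
+ * only the part not already counted as NULL: the new nullfrac becomes
+ * P(null) + P(unmatched) - P(null) * P(unmatched).
+ */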
+ unmatched_frac = sjinfo->unmatched_frac - (sjinfo->unmatched_frac * nullfrac);
+ nullfrac = sjinfo->unmatched_frac + nullfrac -
+ (sjinfo->unmatched_frac * nullfrac);
+ Assert(nullfrac >= 0.0 && nullfrac <= 1.0);
+
+ if (gen_frac != NULL)
+ {
+ *gen_frac = *gen_frac + unmatched_frac -
+ (unmatched_frac * *gen_frac);
+
+ Assert(*gen_frac >= 0.0 && *gen_frac <= 1.0);
+ }
+ }
+
+ return nullfrac;
+}
+
/*
* examine_variable
* Try to look up statistical data about an expression.
@@ -5261,6 +5500,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
Var *var = (Var *) basenode;
/* Set up result fields other than the stats tuple */
+ vardata->root = root;
vardata->var = basenode; /* return Var without relabeling */
vardata->rel = find_base_rel(root, var->varno);
vardata->atttype = var->vartype;
@@ -6096,6 +6336,7 @@ get_variable_numdistinct(VariableStatData *vardata, bool *isdefault)
{
double stadistinct;
double stanullfrac = 0.0;
+ double generatednullsfrac = 0.0;
double ntuples;
*isdefault = false;
@@ -6113,7 +6354,7 @@ get_variable_numdistinct(VariableStatData *vardata, bool *isdefault)
stats = (Form_pg_statistic) GETSTRUCT(vardata->statsTuple);
stadistinct = stats->stadistinct;
- stanullfrac = stats->stanullfrac;
+ stanullfrac = compute_stanullfrac(vardata, &generatednullsfrac);
}
else if (vardata->vartype == BOOLOID)
{
@@ -6200,7 +6441,7 @@ get_variable_numdistinct(VariableStatData *vardata, bool *isdefault)
* If we had a relative estimate, use that.
*/
if (stadistinct < 0.0)
- return clamp_row_est(-stadistinct * ntuples);
+ return clamp_row_est(-stadistinct * ntuples * (1.0 - generatednullsfrac));
/*
* With no data, estimate ndistinct = ntuples if the table is small, else
@@ -6328,7 +6569,7 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata,
for (i = 0; i < sslot.nnumbers; i++)
sumcommon += sslot.numbers[i];
- nullfrac = ((Form_pg_statistic) GETSTRUCT(vardata->statsTuple))->stanullfrac;
+ nullfrac = compute_stanullfrac(vardata, NULL);
if (sumcommon + nullfrac > 0.99999)
use_mcvs = true;
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 60b12446a1c..45e82894b39 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1016,6 +1016,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"compute_innerside_nulls", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables considering NULL values have appeared because of unmatched side of OJ"),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &compute_innerside_nulls,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_group_by_reordering", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables reordering of GROUP BY keys."),
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5b6cadb5a6c..aad097dd57b 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2155,6 +2155,7 @@ typedef struct JoinState
bool single_match; /* True if we should skip to next outer tuple
* after finding one inner match */
ExprState *joinqual; /* JOIN quals (in addition to ps.qual) */
+ double unmatched_tuples;
} JoinState;
/* ----------------
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index bb678bdcdcd..a49661af91f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2218,6 +2218,12 @@ typedef struct JoinPath
* joinrestrictinfo is needed in JoinPath, and can't be merged into the
* parent RelOptInfo.
*/
+
+ /*
+ * Must be -1.0 ("not initialized") for non-outer joins.  For an outer join
+ * it contains the fraction of unmatched tuples.
+ */
+ Selectivity unmatch_frac;
} JoinPath;
/*
@@ -3043,6 +3049,9 @@ struct SpecialJoinInfo
bool semi_can_hash; /* true if semi_operators are all hash */
List *semi_operators; /* OIDs of equality join operators */
List *semi_rhs_exprs; /* righthand-side expressions of these ops */
+
+ /* For an outer join, the fraction of rows without a match. */
+ Selectivity unmatched_frac;
};
/*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 658d76225e4..884e5bf8b27 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -929,6 +929,8 @@ typedef struct Join
bool inner_unique;
/* JOIN quals (in addition to plan.qual) */
List *joinqual;
+
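+ /* For an outer join, the fraction of rows without a match. */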
+ Selectivity unmatched_frac;
} Join;
/* ----------------
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index c5987440817..6b4c3c1eb6a 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -71,6 +71,7 @@ extern PGDLLIMPORT bool enable_partition_pruning;
extern PGDLLIMPORT bool enable_presorted_aggregate;
extern PGDLLIMPORT bool enable_async_append;
extern PGDLLIMPORT int constraint_exclusion;
+extern PGDLLIMPORT bool compute_innerside_nulls;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
double index_pages, PlannerInfo *root);
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 013049b3098..0e865773c5e 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -97,6 +97,7 @@ typedef struct VariableStatData
bool isunique; /* matches unique index, DISTINCT or GROUP-BY
* clause */
bool acl_ok; /* result of ACL check on table or column */
+ PlannerInfo *root;
} VariableStatData;
#define ReleaseVariableStats(vardata) \
@@ -151,6 +152,7 @@ extern PGDLLIMPORT get_index_stats_hook_type get_index_stats_hook;
/* Functions in selfuncs.c */
+extern double compute_stanullfrac(VariableStatData *vardata, double *gen_frac);
extern void examine_variable(PlannerInfo *root, Node *node, int varRelid,
VariableStatData *vardata);
extern bool statistic_proc_security_check(VariableStatData *vardata, Oid func_oid);
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 776f1ad0e53..14083893190 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1868,4 +1868,28 @@ SELECT * FROM check_estimated_rows('SELECT * FROM table_fillfactor');
(1 row)
DROP TABLE table_fillfactor;
+CREATE TABLE test (x integer);
+INSERT INTO test (SELECT gs FROM generate_series(1,10000) AS gs);
+CREATE INDEX ON test (x);
+VACUUM ANALYZE test;
+-- SET enable_hashagg = off; -- avoid unstable memory usage in the report
+SET compute_innerside_nulls = off;
+SELECT * FROM check_estimated_rows('
+ SELECT t2.x FROM test t1 LEFT JOIN test t2 ON (t1.x=-t2.x) GROUP BY t2.x;
+');
+ estimated | actual
+-----------+--------
+ 10000 | 1
+(1 row)
+
+SET compute_innerside_nulls = on;
+SELECT * FROM check_estimated_rows('
+ SELECT t2.x FROM test t1 LEFT JOIN test t2 ON (t1.x=-t2.x) GROUP BY t2.x;
+');
+ estimated | actual
+-----------+--------
+ 200 | 1
+(1 row)
+
+DROP TABLE IF EXISTS test;
-- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 232ab8db8fa..408b65c20a8 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -925,4 +925,22 @@ SELECT * FROM check_estimated_rows('SELECT * FROM table_fillfactor');
DROP TABLE table_fillfactor;
+CREATE TABLE test (x integer);
+INSERT INTO test (SELECT gs FROM generate_series(1,10000) AS gs);
+CREATE INDEX ON test (x);
+VACUUM ANALYZE test;
+
+-- SET enable_hashagg = off; -- avoid unstable memory usage in the report
+
+SET compute_innerside_nulls = off;
+SELECT * FROM check_estimated_rows('
+ SELECT t2.x FROM test t1 LEFT JOIN test t2 ON (t1.x=-t2.x) GROUP BY t2.x;
+');
+
+SET compute_innerside_nulls = on;
+SELECT * FROM check_estimated_rows('
+ SELECT t2.x FROM test t1 LEFT JOIN test t2 ON (t1.x=-t2.x) GROUP BY t2.x;
+');
+DROP TABLE IF EXISTS test;
+
-- End of Stats Test
--
2.39.5