Non-trivial condition is only propagated to one side of JOIN
Hi,
using `PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit`, I've observed the
following behavior:
– keep in mind that this example is as simplified as possible, the
original query involves foreign tables, and the failure to propagate /
push down the condition results in a query plan that basically tries to
download the complete foreign table, which is not a feasible execution
strategy:
Setup:
CREATE TABLE tbl1 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id
INTEGER NOT NULL, data TEXT);
CREATE TABLE tbl2 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id
INTEGER NOT NULL, data TEXT);
CREATE INDEX ON tbl1 (site_id);
CREATE INDEX ON tbl2 (site_id);
Working queries:
SELECT * FROM tbl1 WHERE tbl1.site_id = 1; -- "trivial condition"
SELECT * FROM tbl2 WHERE tbl2.site_id = 1;
SELECT * FROM tbl1 WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; --
"non-trivial condition"
SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL;
1) Exemplary Query Plan:
# EXPLAIN SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL;
QUERY PLAN
-------------------------------------------------------------------------------------
Bitmap Heap Scan on tbl2 (cost=8.40..19.08 rows=12 width=40)
Recheck Cond: ((site_id = 1) OR (site_id IS NULL))
-> BitmapOr (cost=8.40..8.40 rows=12 width=0)
-> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id = 1)
-> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id IS NULL)
(7 rows)
The key takeaway is, that the index can be used, because the condition
is propagated deep enough.
2) Still working example:
# EXPLAIN SELECT * FROM tbl1 LEFT JOIN tbl2 ON tbl2.site_id =
tbl1.site_id WHERE tbl1.site_id = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=8.40..27.80 rows=36 width=80)
-> Bitmap Heap Scan on tbl1 (cost=4.20..13.67 rows=6 width=40)
Recheck Cond: (site_id = 1)
-> Bitmap Index Scan on tbl1_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id = 1)
-> Materialize (cost=4.20..13.70 rows=6 width=40)
-> Bitmap Heap Scan on tbl2 (cost=4.20..13.67 rows=6 width=40)
Recheck Cond: (site_id = 1)
-> Bitmap Index Scan on tbl2_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id = 1)
(10 rows)
The condition is propagated into BOTH branches of the join. The join
could also be an INNER join and might also be realized as a Merge Join
or Hash Join: they all behave the same.
3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id
WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Hash Join (cost=19.23..46.45 rows=72 width=80)
Hash Cond: (tbl2.site_id = tbl1.site_id)
-> Seq Scan on tbl2 (cost=0.00..22.00 rows=1200 width=40)
-> Hash (cost=19.08..19.08 rows=12 width=40)
-> Bitmap Heap Scan on tbl1 (cost=8.40..19.08 rows=12 width=40)
Recheck Cond: ((site_id = 1) OR (site_id IS NULL))
-> BitmapOr (cost=8.40..8.40 rows=12 width=0)
-> Bitmap Index Scan on tbl1_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id = 1)
-> Bitmap Index Scan on tbl1_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id IS NULL)
(11 rows)
Now, a full seq scan used for tbl2, the condition is only pushed down on
ONE side of the JOIN!
(with `WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL`, the Seq Scan
would have been on tbl1... [not so easily demostrated w/ LEFT JOINs]).
Also, `ON tbl1.site_id IS NOT DISTINCT FROM tbl2.site_id` does not help,
The weird thing is: The subqueries on both sides of the join are
perfectly capable of accepting/using the "non-trivial" condition, as
demonstrated in 1), and JOINs are generally able to propagate conditions
to both sides, as demonstrated in 2).
Is there a magic knob to force postgres to do the right thing, or is
this basically a bug in the query planner?
Tobias
On Sunday, August 25, 2024, Tobias Hoffmann <ldev-list@thax.hardliners.org>
wrote:
3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id
WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
The “is null” predicate in this query is doing nothing as your next comment
alludes to; you will produce no rows out of the join with a null site_id
due to the use of the equals operator in the join.
Also, `ON tbl1.site_id IS NOT DISTINCT FROM tbl2.site_id` does not help,
Others may correct me but I’m guessing that indeed the optimizer has a gap
here that could be filled in, it’s just it feels like adding code to deal
with broken queries so isn’t overly motivated to work on. Joining using
distinct instead of equality is uncommon, since nearly all models join
primary keys to foreign keys and both of those are almost always non-null.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Sunday, August 25, 2024, Tobias Hoffmann <ldev-list@thax.hardliners.org>
wrote:3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id
WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
The “is null” predicate in this query is doing nothing as your next comment
alludes to; you will produce no rows out of the join with a null site_id
due to the use of the equals operator in the join.
Indeed. This WHERE clause might be useful with a left join to tbl1,
but then the IS NULL means something totally different and can *not*
be pushed down.
Others may correct me but I’m guessing that indeed the optimizer has a gap
here that could be filled in, it’s just it feels like adding code to deal
with broken queries so isn’t overly motivated to work on.
The short answer is that we expend quite a lot of effort to deduce
implied equality clauses from combinations of equality clauses;
that is, given "WHERE A = B AND B = C" we can deduce "A = C" as well.
No such deductions can be made from equality clauses that are buried
under an OR, because they might not be true for every join row.
Maybe some machinery could be built that would do something useful
with an OR clause of this form, but I doubt it would be useful
often enough to justify the development effort and additional
planner cycles.
An important point here is that "WHERE A = B AND p(A)" does not permit
us to deduce "p(B)" for arbitrary conditions p(), because we have some
equality operators that will return true for values that sort equal
but are distinguishable in other ways. (Handy example: in float8
arithmetic, zero and minus zero compare equal, as required by the IEEE
float spec.) We presently don't assume that this works for anything
other than transitive equality across equality operators of a single
btree operator family. In particular, I don't think we could assume
it for the given example, because "WHERE A = B AND A IS NULL" is
tautologically false. You can deduce anything from a falsehood,
so I'm not entirely sure that the proposed optimization is even
logically sound.
Joining using
distinct instead of equality is uncommon, since nearly all models join
primary keys to foreign keys and both of those are almost always non-null.
Yeah. I'll concede that we probably should work harder on building
out planner and executor support for IS NOT DISTINCT FROM. But again,
it just seems like a case that is not worth spending large amounts of
development time on.
regards, tom lane
On 25/08/2024 17:35, David G. Johnston wrote:
On Sunday, August 25, 2024, Tobias Hoffmann
<ldev-list@thax.hardliners.org> wrote:3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id =
tbl1.site_id WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;The “is null” predicate in this query is doing nothing as your next
comment alludes to; you will produce no rows out of the join with a
null site_id due to the use of the equals operator in the join.
Well, that's why I said: "keep in mind that this example is as
simplified as possible"...
Even though `tbl1.site_id = 1` is – in this case – completely equivalent
to `tbl1.site_id = 1 OR tbl1.site_id IS NULL` – the first one is
completely pushed down, but the second is not.
A more complete example might look more like this:
CREATE VIEW "subview1" AS
SELECT tbl1.site_id, ... JOIN ... ON tbl1.site_id = tbl2.site_id
WHERE ...;
CREATE VIEW "view1" AS
SELECT site_id, ... FROM subview1 -- maybe even: WHERE site_id IS
NOT NULL
UNION ALL
SELECT null, ...;
SELECT * FROM view1 WHERE (site_id = 1 OR site_id IS NULL);
The reason, why the outer query would have a more complicated condition
might have nothing to do with the subquery containing the JOIN.
(This is also not a `UNION ALL` special case: `site_id IS NULL` could
also be generated by a LEFT JOIN, e.g.)
But not pushing down the condition has the grave impact, that those -
otherwise working VIEWs (i.e. subview1) become "unfixably" broken, for
certain WHERE-conditions on the outside.
Another reason why I said `site_id INTEGER NOT NULL` and `IS NOT
DISTINCT FROM`, is that there might be some mathematical reason I'm not
yet aware of where the propagation would not be sound, but which would
not apply for arbitrary site_id [nullable, ...].
My primary goal is to find at least *some* way to get the condition
pushed further in to avoid the full table scan, and to not have to
completely rewrite all the VIEWs into a single big query, where I could
inject the site_id parameter (e.g. "$1") in multiple places as needed...
Tobias
Tobias Hoffmann <ldev-list@thax.hardliners.org> writes:
A more complete example might look more like this:
CREATE VIEW "subview1" AS
SELECT tbl1.site_id, ... JOIN ... ON tbl1.site_id = tbl2.site_id
WHERE ...;
CREATE VIEW "view1" AS
SELECT site_id, ... FROM subview1 -- maybe even: WHERE site_id IS
NOT NULL
UNION ALL
SELECT null, ...;
SELECT * FROM view1 WHERE (site_id = 1 OR site_id IS NULL);
For this particular case, you could probably get somewhere by
writing
SELECT * FROM view1 WHERE site_id = 1
UNION ALL
SELECT * FROM view1 WHERE site_id IS NULL;
since the sets of rows satisfying those two WHERE conditions
must be disjoint. (I recall working on a patch that essentially
tried to do that transformation automatically, but it eventually
failed because things get too messy if the row sets might not
be disjoint.)
regards, tom lane
On 25/08/2024 19:28, Tom Lane wrote:
For this particular case, you could probably get somewhere by
writingSELECT * FROM view1 WHERE site_id = 1
UNION ALL
SELECT * FROM view1 WHERE site_id IS NULL;
Thank you for your suggestion, Tom.
Unfortunately, as I now understand, nothing *except* `var = const` can
ever be propagated to the second branch of the join.
In particular, even just `WHERE site_id IS NULL` no longer propagates
like `WHERE site_id = 1` does.
Other cases, that do not propagate, include `WHERE site_id IN (1, 2)`.
I'll probably have to find another workaround to my current problem.
----
More generally, I think that the currently possible set is very
restrictive and affects not just edge-cases;
my SQL-Engine-implementation-fu is far from good enough for the
necessary changes, though.
Here are some more thoughts:
Maybe some machinery could be built that would do something useful
with an OR clause of this form,[...]
An important point here is that "WHERE A = B AND p(A)" does not permit
us to deduce "p(B)" for arbitrary conditions p(),
IMO the relevant equality should be the `ON tbl2.site_id IS NOT DISTINCT
FROM tbl1.site_id` (resp. `ON tbl2.site_id = tbl1.site_id`, but nulls
probably need special care), which should allow any predicate `p` only
depending on `tbl1.site_id` (i.e.`WHERE p(tbl1.site_id)`, from
"outside") to be pulled "inside" the INNER JOIN or LEFT JOIN, because no
row which does not satisfy `p(tbl1.site_id)`, and, via equivalence,
`p(tbl2.site_id)` could ever be part of the result.
More specifically, given a WHERE clause in CNF (i.e. `p0(...) AND
p1(...) AND (p2a(...) OR p2b(...)) AND ...`), every top-level term which
only uses variables which are deemed equivalent, should be allowed to
propagate.
If this is too difficult, not just single constants (`var = const`), but
sets of constants (`var = ANY(...)`) and/or especially `var IS NULL`)
should be considered.
Just my 2ct...
Tobias
You must use a where clause on the FDW table or you get a full load/SEQ scan of that table, per documentation.
Select * is not recommended for FDW tables.
From: Tobias Hoffmann <ldev-list@thax.hardliners.org>
Date: Sunday, August 25, 2024 at 8:10 AM
To: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Subject: Non-trivial condition is only propagated to one side of JOIN
Hi, using `PostgreSQL 16. 2 (Debian 16. 2-1. pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12. 2. 0-14) 12. 2. 0, 64-bit`, I've observed the following behavior: – keep in mind that this example is as simplified as possible, the original
Hi,
using `PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit`, I've observed the
following behavior:
– keep in mind that this example is as simplified as possible, the
original query involves foreign tables, and the failure to propagate /
push down the condition results in a query plan that basically tries to
download the complete foreign table, which is not a feasible execution
strategy:
Setup:
CREATE TABLE tbl1 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id
INTEGER NOT NULL, data TEXT);
CREATE TABLE tbl2 (id INTEGER GENERATED ALWAYS AS IDENTITY, site_id
INTEGER NOT NULL, data TEXT);
CREATE INDEX ON tbl1 (site_id);
CREATE INDEX ON tbl2 (site_id);
Working queries:
SELECT * FROM tbl1 WHERE tbl1.site_id = 1; -- "trivial condition"
SELECT * FROM tbl2 WHERE tbl2.site_id = 1;
SELECT * FROM tbl1 WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL; --
"non-trivial condition"
SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL;
1) Exemplary Query Plan:
# EXPLAIN SELECT * FROM tbl2 WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL;
QUERY PLAN
-------------------------------------------------------------------------------------
Bitmap Heap Scan on tbl2 (cost=8.40..19.08 rows=12 width=40)
Recheck Cond: ((site_id = 1) OR (site_id IS NULL))
-> BitmapOr (cost=8.40..8.40 rows=12 width=0)
-> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id = 1)
-> Bitmap Index Scan on tbl2_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id IS NULL)
(7 rows)
The key takeaway is, that the index can be used, because the condition
is propagated deep enough.
2) Still working example:
# EXPLAIN SELECT * FROM tbl1 LEFT JOIN tbl2 ON tbl2.site_id =
tbl1.site_id WHERE tbl1.site_id = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=8.40..27.80 rows=36 width=80)
-> Bitmap Heap Scan on tbl1 (cost=4.20..13.67 rows=6 width=40)
Recheck Cond: (site_id = 1)
-> Bitmap Index Scan on tbl1_site_id_idx (cost=0.00..4.20
rows=6 width=0)
Index Cond: (site_id = 1)
-> Materialize (cost=4.20..13.70 rows=6 width=40)
-> Bitmap Heap Scan on tbl2 (cost=4.20..13.67 rows=6 width=40)
Recheck Cond: (site_id = 1)
-> Bitmap Index Scan on tbl2_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id = 1)
(10 rows)
The condition is propagated into BOTH branches of the join. The join
could also be an INNER join and might also be realized as a Merge Join
or Hash Join: they all behave the same.
3) Problematic example:
# EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id = tbl1.site_id
WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Hash Join (cost=19.23..46.45 rows=72 width=80)
Hash Cond: (tbl2.site_id = tbl1.site_id)
-> Seq Scan on tbl2 (cost=0.00..22.00 rows=1200 width=40)
-> Hash (cost=19.08..19.08 rows=12 width=40)
-> Bitmap Heap Scan on tbl1 (cost=8.40..19.08 rows=12 width=40)
Recheck Cond: ((site_id = 1) OR (site_id IS NULL))
-> BitmapOr (cost=8.40..8.40 rows=12 width=0)
-> Bitmap Index Scan on tbl1_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id = 1)
-> Bitmap Index Scan on tbl1_site_id_idx
(cost=0.00..4.20 rows=6 width=0)
Index Cond: (site_id IS NULL)
(11 rows)
Now, a full seq scan used for tbl2, the condition is only pushed down on
ONE side of the JOIN!
(with `WHERE tbl2.site_id = 1 OR tbl2.site_id IS NULL`, the Seq Scan
would have been on tbl1... [not so easily demostrated w/ LEFT JOINs]).
Also, `ON tbl1.site_id IS NOT DISTINCT FROM tbl2.site_id` does not help,
The weird thing is: The subqueries on both sides of the join are
perfectly capable of accepting/using the "non-trivial" condition, as
demonstrated in 1), and JOINs are generally able to propagate conditions
to both sides, as demonstrated in 2).
Is there a magic knob to force postgres to do the right thing, or is
this basically a bug in the query planner?
Tobias