Question about antijoin

Started by dandlalmost 10 years ago4 messagesgeneral
Jump to latest
#1dandl
david@andl.org

NOT EXISTS (SELECT NULL FROM dyr_pause_mot WHERE avlsnr = a.avlsnr)

This can be executed as anti-join and is often more efficient.

This got my interest! It's of great interest to me to know how and when Postgres performs an anti-join (this being a significant omission from SQL).

Is this a reliable trigger: (NOT EXISTS <subselect>)?

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2David Rowley
dgrowleyml@gmail.com
In reply to: dandl (#1)
Re: Question about antijoin

On 12 July 2016 at 12:41, dandl <david@andl.org> wrote:

NOT EXISTS (SELECT NULL FROM dyr_pause_mot WHERE avlsnr = a.avlsnr)

This can be executed as anti-join and is often more efficient.

This got my interest! It's of great interest to me to know how and when Postgres performs an anti-join (this being a significant omission from SQL).

Is this a reliable trigger: (NOT EXISTS <subselect>)?

No. If the subselect did not contain Vars from the outer select, then
the EXISTS or NOT EXISTS in this case wouldn't be a join at all.

create table a (id int primary key);
create table b (id int primary key);

explain select * from a where not exists(select * from b); -- no anti join

Or if a WHERE clause exists for the subquery and it contains a
volatile function, then the subquery will not be eligible to become an
anti-join:

explain select * from a where not exists(select * from b where
a.id=b.id and a.id > random());

See: convert_EXISTS_sublink_to_join() for details.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: dandl (#1)
Re: Question about antijoin

"dandl" <david@andl.org> writes:

This got my interest! It's of great interest to me to know how and when Postgres performs an anti-join (this being a significant omission from SQL).
Is this a reliable trigger: (NOT EXISTS <subselect>)?

That's one case; see convert_EXISTS_sublink_to_join() for the full set
of conditions involved. There is also a relevant transformation in
reduce_outer_joins():

* Another transformation we apply here is to recognize cases like
* SELECT ... FROM a LEFT JOIN b ON (a.x = b.y) WHERE b.y IS NULL;
* If the join clause is strict for b.y, then only null-extended rows could
* pass the upper WHERE, and we can conclude that what the query is really
* specifying is an anti-semijoin. We change the join type from JOIN_LEFT
* to JOIN_ANTI. The IS NULL clause then becomes redundant, and must be
* removed to prevent bogus selectivity calculations, but we leave it to
* distribute_qual_to_rels to get rid of such clauses.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4dandl
david@andl.org
In reply to: Tom Lane (#3)
Re: Question about antijoin

Thanks Tom and David

That's very useful. My interest for Andl is to be able to emit SQL that
Postgres will reliably interpret as an anti-join, in the absence of an
explicit form in SQL.

But your reference to "anti-semijoin" is interesting -- what is that? Is it
just another name for anti-join, or something different? Does Postgres have
one algorithm or two?

[And BTW that is a weird piece of SQL -- I guess people really do write
those things and you have to make the best of them you can.]

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Tom Lane
Sent: Wednesday, 13 July 2016 12:13 AM
To: dandl <david@andl.org>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Question about antijoin

"dandl" <david@andl.org> writes:

This got my interest! It's of great interest to me to know how and when

Postgres performs an anti-join (this being a significant omission from

SQL).

Is this a reliable trigger: (NOT EXISTS <subselect>)?

That's one case; see convert_EXISTS_sublink_to_join() for the full set of
conditions involved. There is also a relevant transformation in
reduce_outer_joins():

* Another transformation we apply here is to recognize cases like
* SELECT ... FROM a LEFT JOIN b ON (a.x = b.y) WHERE b.y IS

NULL;

* If the join clause is strict for b.y, then only null-extended rows

could

* pass the upper WHERE, and we can conclude that what the query is really
* specifying is an anti-semijoin. We change the join type from JOIN_LEFT
* to JOIN_ANTI. The IS NULL clause then becomes redundant, and must be
* removed to prevent bogus selectivity calculations, but we leave it to
* distribute_qual_to_rels to get rid of such clauses.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general