BUG #9896: Bug in FULL OUTER JOIN

Started by SQLproabout 12 years ago5 messagesbugs
Jump to latest
#1SQLpro
sqlpro@sqlspot.com

The following bug has been logged on the website:

Bug reference: 9896
Logged by: SQLpro
Email address: sqlpro@sqlspot.com
PostgreSQL version: 9.3.4
Operating system: Windows
Description:

According to ISO standard SQL there is no limitation in predicat ON for JOIN
operator.
But PostGreSQL is unable to join tables in the FULL OUTER mode when
predicate ON is not sargable.
here is an example :

CREATE TABLE T_CLIENT_CLI --> customers
(CLI_ID INT PRIMARY KEY,
CLI_SIREN CHAR(9) NOT NULL UNIQUE,
CLI_ENSEIGNE VARCHAR(16) NOT NULL);

CREATE TABLE T_PROSPECT_PSP --> sale perspectives
(PSP_ID INT PRIMARY KEY,
PSP_SIREN CHAR(9) NOT NULL UNIQUE,
PSP_ENSEIGNE VARCHAR(16) NOT NULL);

We want to join on SIREN (national French firm number) or, in the case of
mistake in this number, on the name (enseigne).

Some datas :

INSERT INTO T_CLIENT_CLI VALUES
(1, '123456789', 'IBM'),
(2, '111111111', 'Microsoft'),
(3, '999555111', 'SAP');

INSERT INTO T_PROSPECT_PSP VALUES
(101, '123456789', 'IBM'),
(102, '555555555', 'Microsoft'),
(103, '777777777', 'Amazon'),
(104, '444444444', 'Google');

This query answer exactly to the question :

SELECT *
FROM T_CLIENT_CLI AS C
FULL OUTER JOIN T_PROSPECT_PSP AS P
ON C.CLI_SIREN = P.PSP_SIREN
OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE;

The result would be :

CLI_ID CLI_SIREN CLI_ENSEIGNE PSP_ID PSP_SIREN PSP_ENSEIGNE
----------- --------- ---------------- ----------- ---------
----------------
1 123456789 IBM 101 123456789 IBM
2 111111111 Microsoft 102 555555555 Microsoft
3 999555111 SAP NULL NULL NULL
NULL NULL NULL 103 777777777 Amazon
NULL NULL NULL 104 444444444 Google

But PG throw an exception :

ERREUR: FULL JOIN is only supported with merge-joinable or hash-joinable
join conditions
État SQL :0A000

This is the case in version 9.1.2 too.
It seems that is the case in any version of PG...

By comparizon, this query guive the exact answer in :
- MS SQL Server since version 2000 to 2014 (2005, 2008/2008R2, 2012).
- Oracle 11G

Thanks

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: SQLpro (#1)
Re: BUG #9896: Bug in FULL OUTER JOIN

sqlpro@sqlspot.com writes:

SELECT *
FROM T_CLIENT_CLI AS C
FULL OUTER JOIN T_PROSPECT_PSP AS P
ON C.CLI_SIREN = P.PSP_SIREN
OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE;

But PG throw an exception :

ERREUR: FULL JOIN is only supported with merge-joinable or hash-joinable
join conditions

Yeah. The problem is the OR condition, which doesn't provide anything
that will work with either of those methods.

By comparizon, this query guive the exact answer in :
- MS SQL Server since version 2000 to 2014 (2005, 2008/2008R2, 2012).
- Oracle 11G

Interesting. I wonder where they keep the per-row match status?

Can you show us the query plans used by those systems?

In any case, don't hold your breath waiting for a fix; I'm just asking
to gather information for possible future work. This is unlikely to
be simple to fix, and it's not going to be very high on anyone's priority
list either, given that few people use FULL JOIN as far as I've heard.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #9896: Bug in FULL OUTER JOIN

Tom Lane-2 wrote

In any case, don't hold your breath waiting for a fix; I'm just asking
to gather information for possible future work. This is unlikely to
be simple to fix, and it's not going to be very high on anyone's priority
list either, given that few people use FULL JOIN as far as I've heard.

Infrequently, maybe, but especially for dataset comparisons FULL JOIN is
convenient. That said, typically reformulation of the data and/or query is
possible, even if somewhat slower, so it isn't like getting a result is
impossible given this limitation.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-9896-Bug-in-FULL-OUTER-JOIN-tp5799008p5799042.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: BUG #9896: Bug in FULL OUTER JOIN

I wrote:

sqlpro@sqlspot.com writes:

SELECT *
FROM T_CLIENT_CLI AS C
FULL OUTER JOIN T_PROSPECT_PSP AS P
ON C.CLI_SIREN = P.PSP_SIREN
OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE;

Can you show us the query plans used by those systems?

For the archives' sake: the OP sent me a not-too-useful screen shot
in which SQL Server claims it's using a merge join for this query.

I find this less than credible: what linear sort order would bring
together all the potentially joinable rows? If they're getting the
right answer at all, there must be some secret sauce in there someplace.
Perhaps they're just Doing It The Hard Way with state storage
proportional to the size of the relations?

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#5Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#4)
Re: BUG #9896: Bug in FULL OUTER JOIN

On 04/07/2014 10:17 PM, Tom Lane wrote:

I wrote:

sqlpro@sqlspot.com writes:

SELECT *
FROM T_CLIENT_CLI AS C
FULL OUTER JOIN T_PROSPECT_PSP AS P
ON C.CLI_SIREN = P.PSP_SIREN
OR C.CLI_ENSEIGNE = P.PSP_ENSEIGNE;

Can you show us the query plans used by those systems?

For the archives' sake: the OP sent me a not-too-useful screen shot
in which SQL Server claims it's using a merge join for this query.

I find this less than credible: what linear sort order would bring
together all the potentially joinable rows? If they're getting the
right answer at all, there must be some secret sauce in there someplace.
Perhaps they're just Doing It The Hard Way with state storage
proportional to the size of the relations?

And Oracle does it in two steps: http://www.sqlfiddle.com/#!4/607e0/1/0

--
Vik

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs