psql12.3 + jdbc_fdw - return wrong query results by using OR

Started by Emi Lualmost 6 years ago7 messagesgeneral
Jump to latest
#1Emi Lu
emilu@encs.concordia.ca

Hello,

psql12.3 + jdbc_fdw(oracle18.x), tried:

[1]: select count(*) from oracle_t1 as a inner join local_t1 as b on (a.c1 = b.c1);
select count(*)
from oracle_t1 as a
inner join local_t1 as b
on (a.c1 = b.c1);

return 100 records

[2]: select count(*) from oracle_t1 as a inner join local_t1 as b on (a.c1 = b.c1 OR a.c2 = b.c1 );
select count(*)
from oracle_t1 as a
inner join local_t1 as b
on (a.c1 = b.c1
OR
a.c2 = b.c1
);

return only 2 records

(no null values in both tables.)

May I know what may cause the error please?

Thanks a lot.

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Emi Lu (#1)
Re: psql12.3 + jdbc_fdw - return wrong query results by using OR

On 5/28/20 8:39 AM, emilu@encs.concordia.ca wrote:

Hello,

psql12.3 + jdbc_fdw(oracle18.x), tried:

[1]
select count(*)
from oracle_t1 as a
inner join local_t1 as b
on (a.c1 = b.c1);

return 100 records

[2]
select count(*)
from oracle_t1 as a
inner join local_t1 as b
on (a.c1 = b.c1
    OR
    a.c2 = b.c1
   );

return only 2 records

(no null values in both tables.)

May I know what may cause the error please?

I'm guessing you are seeing this:

https://www.postgresql.org/docs/12/sql-expressions.html

4.2.14. Expression Evaluation Rules

". Boolean expressions (AND/OR/NOT combinations) in those clauses can be
reorganized in any manner allowed by the laws of Boolean algebra."

It would help to see the EXPLAIN ANALYZE for the queries above.

Thanks a lot.

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Emi Lu (#1)
Re: psql12.3 + jdbc_fdw - return wrong query results by using OR

On 5/28/20 9:59 AM, Ying Lu wrote:

Hello,

would help to see the EXPLAIN ANALYZE for the queries above.

And the EXPLAIN ANALYZE for the first query?

Also please include the entire query, for example:

EXPLAIN ANALYZE select count(*) from oracle_t1 as a inner join local_t1
as b on (a.c1 = b.c1 OR a.c2 = b.c1);

This is important because in below I see:

Filter: (yr= '2020'::text)

which I don't see in your original queries.

Please find the explain analyze info
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..736.49 rows=489 width=333) (actual time=313.495..1224.671 rows=9 loops=1)
Join Filter: ((a.c1 = b.c1) OR (a.c2 = b.c1))
Rows Removed by Join Filter: 2106
-> Seq Scan on local_t1 b (cost=0.00..1.49 rows=49 width=13) (actual time=0.016..0.029 rows=49 loops=1)
-> Foreign Scan on oracle_t1 a (cost=0.00..0.00 rows=1000 width=320) (actual time=3.445..24.977 rows=43 loops=49)
Filter: (yr= '2020'::text)
Rows Removed by Filter: 255
Planning Time: 0.532 ms
Execution Time: 1327.697 ms

Thanks.
________________________________________

psql12.3 + jdbc_fdw(oracle18.x), tried:
[1]
select count(*)
from oracle_t1 as a
inner join local_t1 as b
on (a.c1 = b.c1);

return 100 records

[2]
select count(*)
from oracle_t1 as a
inner join local_t1 as b
on (a.c1 = b.c1
OR
a.c2 = b.c1
);

return only 2 records

(no null values in both tables.)

May I know what may cause the error please?

I'm guessing you are seeing this:

https://www.postgresql.org/docs/12/sql-expressions.html

4.2.14. Expression Evaluation Rules

". Boolean expressions (AND/OR/NOT combinations) in those clauses can be
reorganized in any manner allowed by the laws of Boolean algebra."

It would help to see the EXPLAIN ANALYZE for the queries above.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#4Ying Lu
emi.lu@concordia.ca
In reply to: Adrian Klaver (#3)
Re: psql12.3 + jdbc_fdw - return wrong query results by using OR

Hello,

Please find the info for both SQLs (removed yr)

For Q1:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=5.40..5.41 rows=1 width=8) (actual time=1267.001..1267.001 rows=1 loops=1)
-> Hash Join (cost=2.10..4.78 rows=245 width=0) (actual time=0.134..1265.840 rows=2650 loops=1)
Hash Cond: (a.c1 = b.c1)
-> Foreign Scan on oracle_t1 a (cost=0.00..0.00 rows=1000 width=32) (actual time=0.026..1257.823 rows=14625 loops=1)
-> Hash (cost=1.49..1.49 rows=49 width=5) (actual time=0.030..0.030 rows=49 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on local_t1 b (cost=0.00..1.49 rows=49 width=5) (actual time=0.014..0.021 rows=49 loops=1)
Planning Time: 0.178 ms
Execution Time: 1363.482 ms

For Q2:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=737.71..737.72 rows=1 width=8) (actual time=1197.366..1197.366 rows=1 loops=1)
-> Nested Loop (cost=0.00..736.49 rows=489 width=0) (actual time=16.649..1197.292 rows=70 loops=1)
Join Filter: ((a.c1 = b.c1) OR (a.c2 = b.c1))
Rows Removed by Join Filter: 14555
-> Seq Scan on local_t1 b (cost=0.00..1.49 rows=49 width=5) (actual time=0.016..0.023 rows=49 loops=1)
-> Foreign Scan on oracle_t1 a (cost=0.00..0.00 rows=1000 width=64) (actual time=0.002..24.284 rows=298 loops=49)
Planning Time: 0.972 ms
Execution Time: 1299.896 ms

Thanks a lot.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ying Lu (#4)
Re: psql12.3 + jdbc_fdw - return wrong query results by using OR

Ying Lu <emi.lu@concordia.ca> writes:

For Q1:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=5.40..5.41 rows=1 width=8) (actual time=1267.001..1267.001 rows=1 loops=1)
-> Hash Join (cost=2.10..4.78 rows=245 width=0) (actual time=0.134..1265.840 rows=2650 loops=1)
Hash Cond: (a.c1 = b.c1)
-> Foreign Scan on oracle_t1 a (cost=0.00..0.00 rows=1000 width=32) (actual time=0.026..1257.823 rows=14625 loops=1)
-> Hash (cost=1.49..1.49 rows=49 width=5) (actual time=0.030..0.030 rows=49 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on local_t1 b (cost=0.00..1.49 rows=49 width=5) (actual time=0.014..0.021 rows=49 loops=1)
Planning Time: 0.178 ms
Execution Time: 1363.482 ms

For Q2:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=737.71..737.72 rows=1 width=8) (actual time=1197.366..1197.366 rows=1 loops=1)
-> Nested Loop (cost=0.00..736.49 rows=489 width=0) (actual time=16.649..1197.292 rows=70 loops=1)
Join Filter: ((a.c1 = b.c1) OR (a.c2 = b.c1))
Rows Removed by Join Filter: 14555
-> Seq Scan on local_t1 b (cost=0.00..1.49 rows=49 width=5) (actual time=0.016..0.023 rows=49 loops=1)
-> Foreign Scan on oracle_t1 a (cost=0.00..0.00 rows=1000 width=64) (actual time=0.002..24.284 rows=298 loops=49)
Planning Time: 0.972 ms
Execution Time: 1299.896 ms

The numbers here are consistent with the theory that there are 14625 rows
in the foreign table, but when oracle_t1 is scanned on the inside of a
nest loop, the FDW returns all of them on the first scan and then forgets
to return any when rescanned. This'd be a bug in jdbc_fdw, and a
pretty bad one :-(. But you'd have to report it to the jdbc_fdw
author(s) --- the core Postgres project doesn't maintain that.

regards, tom lane

#6Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Emi Lu (#1)
Re: psql12.3 + jdbc_fdw - return wrong query results by using OR

On Thu, 2020-05-28 at 11:39 -0400, emilu@encs.concordia.ca wrote:

psql12.3 + jdbc_fdw(oracle18.x), tried:

[got bad query results]

How about giving oracle_fdw a try?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#7Ying Lu
emi.lu@concordia.ca
In reply to: Laurenz Albe (#6)
Re: psql12.3 + jdbc_fdw - return wrong query results by using OR

psql12.3 + jdbc_fdw(oracle18.x), tried:

[got bad query results]

How about giving oracle_fdw a try?

I will try oracle_fdw then.

Thank you very much for everyone's help!