BUG #6335: Weird planner decision with exists (a join b) condition

Started by Maxim Bogukover 14 years ago7 messagesbugs

maxim.boguk@gmail.com

over 14 years ago

The following bug has been logged on the website:

Bug reference: 6335
Logged by: Maksym Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 9.0.4
Operating system: Linux Ubuntu
Description:

I was explored reasons of high DB load and I localized the next problem
query:

That is correct version:

EXPLAIN ANALYZE select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699)
and exists (
SELECT * from resume
join resume_view_history using (resume_id)
where
resume.user_id = aas.user_id
);

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=0.00..20.51 rows=1 width=65) (actual
time=0.031..0.032 rows=1 loops=1)
-> Index Scan using applicant_adv_subscription_user_id_key on
applicant_adv_subscription aas (cost=0.00..0.02 rows=1 width=65) (actual
time=0.011..0.012 rows=1 loops=1)
Index Cond: (user_id = 5112699)
-> Nested Loop (cost=0.00..20.49 rows=3118 width=4) (actual
time=0.018..0.018 rows=1 loops=1)
-> Index Scan using resume_user_id_key on resume (cost=0.00..0.13
rows=18 width=8) (actual time=0.008..0.008 rows=1 loops=1)
Index Cond: (user_id = 5112699)
-> Index Scan using resume_view_history_fk73b63ccd36b06a5 on
resume_view_history (cost=0.00..0.95 rows=173 width=4) (actual
time=0.009..0.009 rows=1 loops=1)
Index Cond: (resume_view_history.resume_id =
resume.resume_id)
Total runtime: 0.080 ms

But once I add second value into IN list plan become completely screwed:

EXPLAIN analyze select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699,7995496)
and exists (
SELECT * from resume
join resume_view_history using (resume_id)
where
resume.user_id = aas.user_id
);

QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=38967.39..735076.65 rows=2 width=65) (actual
time=14656.388..389866.211 rows=1 loops=1)
Join Filter: (aas.user_id = resume.user_id)
-> Bitmap Heap Scan on applicant_adv_subscription aas (cost=0.02..0.04
rows=2 width=65) (actual time=0.018..0.021 rows=2 loops=1)
Recheck Cond: (user_id = ANY ('{5112699,7995496}'::integer[]))
-> Bitmap Index Scan on applicant_adv_subscription_user_id_key
(cost=0.00..0.02 rows=2 width=0) (actual time=0.014..0.014 rows=2 loops=1)
Index Cond: (user_id = ANY ('{5112699,7995496}'::integer[]))
-> Hash Join (cost=38967.36..726839.23 rows=272203680 width=4) (actual
time=13267.456..182842.841 rows=136136926 loops=2)
Hash Cond: (resume_view_history.resume_id = resume.resume_id)
-> Seq Scan on resume_view_history (cost=0.00..282228.92
rows=272203680 width=4) (actual time=0.004..25574.666 rows=136161776
loops=2)
-> Hash (cost=21737.05..21737.05 rows=16110150 width=8) (actual
time=13260.145..13260.145 rows=16114222 loops=2)
Buckets: 2097152 Batches: 2 Memory Usage: 314776kB
-> Seq Scan on resume (cost=0.00..21737.05 rows=16110150
width=8) (actual time=0.005..8839.480 rows=16114222 loops=2)
Total runtime: 389866.374 ms

Ooops.

Changes in *_cost settings have no effect.
Disabling seq_scan/merge_join/hash_join doest not help as well.

Nothing special about tables, all required indexes on the place.

Alvaro Herrera

alvherre@2ndquadrant.com

over 14 years ago

In reply to: Maxim Boguk (#1)

Re: BUG #6335: Weird planner decision with exists (a join b) condition

Excerpts from maxim.boguk's message of mié dic 14 08:09:38 -0300 2011:

But once I add second value into IN list plan become completely screwed:

See here:
http://archives.postgresql.org/message-id/1309918036-sup-4092@alvh.no-ip.org
Perhaps it's a similar problem.

Maybe you'd get enthused enough to try to fix the problem?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Maxim Boguk

maxim.boguk@gmail.com

over 14 years ago

In reply to: Alvaro Herrera (#2)

Re: BUG #6335: Weird planner decision with exists (a join b) condition

Here goes self-contained test case.

I tested it on the 9.1.2, 9.1.1, 9.0.5, 9.0.4, 8.4.7
all of them affected by the problem:

select version();
drop table if exists test1;
drop table if exists test2;
drop table if exists test3;
CREATE TABLE test1 AS SELECT user_id FROM generate_series(1,1000000) AS
gs(user_id);
CREATE TABLE test2 AS SELECT user_id,user_id AS resume_id FROM
generate_series(1,1000000) AS gs(user_id);
CREATE TABLE test3 AS SELECT user_id AS resume_id FROM
generate_series(1,1000000) AS gs(user_id);
create index test1_user_id_key on test1(user_id);
create index test2_user_id_key on test2(user_id);
create index test2_resume_id_key on test2(resume_id);
create index test3_resume_id_key on test3(resume_id);
analyze test1;
analyze test2;
analyze test3;
--good
EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (100)
and exists (
SELECT * from test2
join test3 using (resume_id)
where
test2.user_id = test1.user_id
);
--bad
EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (100, 101)
and exists (
SELECT * from test2
join test3 using (resume_id)
where
test2.user_id = test1.user_id
);

On Thu, Dec 15, 2011 at 1:00 AM, Alvaro Herrera
<alvherre@commandprompt.com>wrote:

Excerpts from maxim.boguk's message of mié dic 14 08:09:38 -0300 2011:

But once I add second value into IN list plan become completely screwed:

See here:

http://archives.postgresql.org/message-id/1309918036-sup-4092@alvh.no-ip.org
Perhaps it's a similar problem.

Maybe you'd get enthused enough to try to fix the problem?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
Сила солому ломит, но не все в нашей жизни - солома, да и сила далеко не
все.

bricklen

bricklen@gmail.com

over 14 years ago

In reply to: Maxim Boguk (#3)

Re: BUG #6335: Weird planner decision with exists (a join b) condition

On Wed, Dec 14, 2011 at 4:53 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:

Here goes self-contained test case.

I tested it on the 9.1.2, 9.1.1, 9.0.5, 9.0.4, 8.4.7

I just tested on 9.1.2 and see the same issue.

--bad
EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (100, 101)
and exists (
SELECT * from test2
join test3 using (resume_id)
where
test2.user_id = test1.user_id
);

Setting enable_hashjoin to false pushes it back to a good plan again.

Maxim Boguk

maxim.boguk@gmail.com

over 14 years ago

In reply to: bricklen (#4)

Re: BUG #6335: Weird planner decision with exists (a join b) condition

On Thu, Dec 15, 2011 at 12:00 PM, bricklen <bricklen@gmail.com> wrote:

On Wed, Dec 14, 2011 at 4:53 PM, Maxim Boguk <maxim.boguk@gmail.com>
wrote:

Here goes self-contained test case.

I tested it on the 9.1.2, 9.1.1, 9.0.5, 9.0.4, 8.4.7

I just tested on 9.1.2 and see the same issue.

--bad
EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (100, 101)
and exists (
SELECT * from test2
join test3 using (resume_id)
where
test2.user_id = test1.user_id
);

Setting enable_hashjoin to false pushes it back to a good plan again.

Could you show explain analyze of the good plan please?

If you getting plan like:

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=3.78..91844.51 rows=2 width=4) (actual
time=0.634..0.750 rows=2 loops=1)
Join Filter: (test1.user_id = test2.user_id)
-> Bitmap Heap Scan on test1 (cost=2.54..4.57 rows=2 width=4) (actual
time=0.013..0.015 rows=2 loops=1)
Recheck Cond: (user_id = ANY ('{100,101}'::integer[]))
-> Bitmap Index Scan on test1_user_id_key (cost=0.00..2.54
rows=2 width=0) (actual time=0.009..0.009 rows=2 loops=1)
Index Cond: (user_id = ANY ('{100,101}'::integer[]))
-> Materialize (cost=1.23..64339.94 rows=1000000 width=4) (actual
time=0.018..0.305 rows=100 loops=2)
-> Merge Join (cost=1.23..59339.94 rows=1000000 width=4) (actual
time=0.032..0.446 rows=101 loops=1)
Merge Cond: (test2.resume_id = test3.resume_id)
-> Index Scan using test2_resume_id_key on test2
(cost=0.00..22170.28 rows=1000000 width=8) (actual time=0.009..0.076
rows=101 loops=1)
-> Index Scan using test3_resume_id_key on test3
(cost=0.00..22170.28 rows=1000000 width=4) (actual time=0.007..0.075
rows=101 loops=1)
Total runtime: 0.785 ms

Try use high values for the user_id :

EXPLAIN ANALYZE select *
from test1
where
test1.user_id in (90000, 900001)
and exists (
SELECT * from test2
join test3 using (resume_id)
where
test2.user_id = test1.user_id
);

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=3.78..91844.51 rows=2 width=4) (actual
time=432.266..4457.799 rows=2 loops=1)
Join Filter: (test1.user_id = test2.user_id)
-> Bitmap Heap Scan on test1 (cost=2.54..4.57 rows=2 width=4) (actual
time=0.057..0.063 rows=2 loops=1)
Recheck Cond: (user_id = ANY ('{90000,900001}'::integer[]))
-> Bitmap Index Scan on test1_user_id_key (cost=0.00..2.54
rows=2 width=0) (actual time=0.050..0.050 rows=2 loops=1)
Index Cond: (user_id = ANY ('{90000,900001}'::integer[]))
-> Materialize (cost=1.23..64339.94 rows=1000000 width=4) (actual
time=0.011..1942.046 rows=495000 loops=2)
-> Merge Join (cost=1.23..59339.94 rows=1000000 width=4) (actual
time=0.018..2805.842 rows=900001 loops=1)
Merge Cond: (test2.resume_id = test3.resume_id)
-> Index Scan using test2_resume_id_key on test2
(cost=0.00..22170.28 rows=1000000 width=8) (actual time=0.007..571.851
rows=900001 loops=1)
-> Index Scan using test3_resume_id_key on test3
(cost=0.00..22170.28 rows=1000000 width=4) (actual time=0.006..594.484
rows=900001 loops=1)
Total runtime: 4467.887 ms

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
Сила солому ломит, но не все в нашей жизни - солома, да и сила далеко не
все.

Tom Lane

tgl@sss.pgh.pa.us

over 14 years ago

In reply to: Maxim Boguk (#1)

Re: BUG #6335: Weird planner decision with exists (a join b) condition

maxim.boguk@gmail.com writes:

EXPLAIN analyze select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699,7995496)
and exists (
SELECT * from resume
join resume_view_history using (resume_id)
where
resume.user_id = aas.user_id
);

I'm hoping to fix this type of case with the "generalized inner
indexscan" work that I've been nattering about for a year or two now.
What you need to make this fast, given that resume and
resume_view_history are both large, is to push the current value of
aas.user_id down into the table scan of resume --- and because the join
and semijoin can't be reordered, that's not possible with the planner's
current simpleminded idea of what an inner indexscan can be.

The other example you show manages to luck out and get a good plan due
to transitive propagation of equality conditions, but that's a narrow
special case. Any other form of constraint whatsoever on aas is going
to end up with the crummy plan where the whole lower join gets computed.

regards, tom lane

Maxim Boguk

maxim.boguk@gmail.com

over 14 years ago

In reply to: Tom Lane (#6)

Re: BUG #6335: Weird planner decision with exists (a join b) condition

On Thu, Dec 15, 2011 at 7:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

maxim.boguk@gmail.com writes:

EXPLAIN analyze select *
from applicant_adv_subscription aas
where
aas.user_id in (5112699,7995496)
and exists (
SELECT * from resume
join resume_view_history using (resume_id)
where
resume.user_id = aas.user_id
);

I'm hoping to fix this type of case with the "generalized inner
indexscan" work that I've been nattering about for a year or two now.
What you need to make this fast, given that resume and
resume_view_history are both large, is to push the current value of
aas.user_id down into the table scan of resume --- and because the join
and semijoin can't be reordered, that's not possible with the planner's
current simpleminded idea of what an inner indexscan can be.

The other example you show manages to luck out and get a good plan due
to transitive propagation of equality conditions, but that's a narrow
special case. Any other form of constraint whatsoever on aas is going
to end up with the crummy plan where the whole lower join gets computed.

regards, tom lane

Thank you very much for information.
Rewriting the query did the trick and resolved performance issues.

Do you plan create "generalized inner indexscan" mechanics for 9.2 version?

--
Maxim Boguk