Optimizer generates bad plans.

Started by Kris Jurkaover 23 years ago11 messages
#1Kris Jurka
jurka@ejurka.com
3 attachment(s)

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans. I have
attached the query and explain analyze outputs against today's cvs head
for queries that take between 9 and 845941 msec. In the JDBC Driver I
will specify a reasonable join order using explicit JOINs, but I thought
someone might be interested in a test case for the optimizer.

Kris Jurka

The query tries to determine what foreign keys exists between the
following tables.

create table people (id int4 primary key, name text);
create table policy (id int4 primary key, name text);
create table users (id int4 primary key, people_id int4,
policy_id int4,
CONSTRAINT people FOREIGN KEY (people_id) references people(id),
constraint policy FOREIGN KEY (policy_id) references policy(id));

Attachments:

query.sqltext/plain; name=query.sqlDownload
badplan.txttext/plain; name=badplan.txtDownload
goodplan.txttext/plain; name=goodplan.txtDownload
#2Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Kris Jurka (#1)
Re: Optimizer generates bad plans.

Congratulations. That is the largest plan I have ever seen. ;-)

---------------------------------------------------------------------------

Kris Jurka wrote:

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans. I have
attached the query and explain analyze outputs against today's cvs head
for queries that take between 9 and 845941 msec. In the JDBC Driver I
will specify a reasonable join order using explicit JOINs, but I thought
someone might be interested in a test case for the optimizer.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Kris Jurka
books@ejurka.com
In reply to: Bruce Momjian (#2)
Re: Optimizer generates bad plans.

Well I was really hoping pg_constraint would solve all my problems, but
since contrib/array is not installed by default the conkeys and confkeys
columns aren't terribly useful because they can't be joined to
pg_attribute.

Also there is not a column to tell you the unique constraint that
supports a given foreign key constraint.

See my post to bugs:

http://fts.postgresql.org/db/mw/msg.html?mid=1074855

Kris Jurka

On Thu, 19 Sep 2002, Bruce Momjian wrote:

Show quoted text

Congratulations. That is the largest plan I have ever seen. ;-)

---------------------------------------------------------------------------

Kris Jurka wrote:

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans. I have
attached the query and explain analyze outputs against today's cvs head
for queries that take between 9 and 845941 msec. In the JDBC Driver I
will specify a reasonable join order using explicit JOINs, but I thought
someone might be interested in a test case for the optimizer.

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kris Jurka (#1)
Re: Optimizer generates bad plans.

Kris Jurka <jurka@ejurka.com> writes:

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans.

Hm, does an ANALYZE help?

regards, tom lane

#5Rod Taylor
rbt@rbt.ca
In reply to: Kris Jurka (#3)
Re: Optimizer generates bad plans.

Maybe not nice, but there's only 32 (64 now?) of them...

JOIN pg_attribute WHERE attnum IN (conkeys[1], conkeys[2], conkeys[3],
..., conkeys[32])

Great fun...

On Thu, 2002-09-19 at 18:31, Kris Jurka wrote:

Well I was really hoping pg_constraint would solve all my problems, but
since contrib/array is not installed by default the conkeys and confkeys
columns aren't terribly useful because they can't be joined to
pg_attribute.

Also there is not a column to tell you the unique constraint that
supports a given foreign key constraint.

See my post to bugs:

http://fts.postgresql.org/db/mw/msg.html?mid=1074855

Kris Jurka

On Thu, 19 Sep 2002, Bruce Momjian wrote:

Congratulations. That is the largest plan I have ever seen. ;-)

---------------------------------------------------------------------------

Kris Jurka wrote:

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans. I have
attached the query and explain analyze outputs against today's cvs head
for queries that take between 9 and 845941 msec. In the JDBC Driver I
will specify a reasonable join order using explicit JOINs, but I thought
someone might be interested in a test case for the optimizer.

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

--
Rod Taylor

#6Neil Conway
neilc@samurai.com
In reply to: Kris Jurka (#1)
Re: Optimizer generates bad plans.

Kris Jurka <jurka@ejurka.com> writes:

While adding schema support to the JDBC Driver, I came across a
query which occasionally generates some spectacularly bad plans.

Interesting. The inconsistency you're seeing is a result of GEQO. I
would have hoped that it would have produced a better quality plan
more often, but apparently not. On my system, the regular query
optimizer handily beats GEQO for this query: it produces more
efficienty query plans 100% of the time and takes less time to do so.

For *this* query at least, raising geqo_threshold would be a good
idea, but that may not be true universally.

I thought someone might be interested in a test case for the
optimizer.

Thanks, it's a useful query -- I've been meaning to take a look at
GEQO for a while now...

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

#7Kris Jurka
books@ejurka.com
In reply to: Tom Lane (#4)
Re: Optimizer generates bad plans.

On Thu, 19 Sep 2002, Tom Lane wrote:

Kris Jurka <jurka@ejurka.com> writes:

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans.

Hm, does an ANALYZE help?

Yes, it does, but I don't understand why. The query is entirely against
pg_catalog tables which have had all of three tables added to them. How
can the new ANALYZE stats be significantly different than what came from
the ANALYZED template1.

Kris Jurka

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Conway (#6)
Re: Optimizer generates bad plans.

Neil Conway <neilc@samurai.com> writes:

Interesting. The inconsistency you're seeing is a result of GEQO. I
would have hoped that it would have produced a better quality plan
more often, but apparently not. On my system, the regular query
optimizer handily beats GEQO for this query: it produces more
efficienty query plans 100% of the time and takes less time to do so.
For *this* query at least, raising geqo_threshold would be a good
idea, but that may not be true universally.

The current GEQO threshold was set some time ago; since then, the
regular optimizer has been improved while the GEQO code hasn't been
touched. It might well be time to ratchet up the threshold.

Anyone care to do some additional experiments?

regards, tom lane

#9Kris Jurka
books@ejurka.com
In reply to: Kris Jurka (#7)
Re: Optimizer generates bad plans.

On Thu, 19 Sep 2002, Kris Jurka wrote:

On Thu, 19 Sep 2002, Tom Lane wrote:

Kris Jurka <jurka@ejurka.com> writes:

While adding schema support to the JDBC Driver, I came across a query
which occasionally generates some spectacularly bad plans.

Hm, does an ANALYZE help?

Yes, it does, but I don't understand why. The query is entirely against
pg_catalog tables which have had all of three tables added to them. How
can the new ANALYZE stats be significantly different than what came from
the ANALYZED template1.

Kris Jurka

Looking at the differences in statistics before and after the ANALYZE the
only differences are in correlation. This comes from initdb around line
1046...

"$PGPATH"/postgres $PGSQL_OPT template1 >/dev/null <<EOF
ANALYZE;
VACUUM FULL FREEZE;
EOF

Could this be done better in the one step VACUUM FULL FREEZE ANALYZE or
ANALYZING after the VACUUM FULL?

Kris Jurka

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kris Jurka (#9)
Re: Optimizer generates bad plans.

Kris Jurka <books@ejurka.com> writes:

Looking at the differences in statistics before and after the ANALYZE the
only differences are in correlation. This comes from initdb around line
1046...

"$PGPATH"/postgres $PGSQL_OPT template1 >/dev/null <<EOF
ANALYZE;
VACUUM FULL FREEZE;
EOF

Could this be done better in the one step VACUUM FULL FREEZE ANALYZE or
ANALYZING after the VACUUM FULL?

Hm. We can't do it like that, because that would leave the pg_statistic
rows unfrozen. I suppose we could do

VACUUM FULL;
ANALYZE;
VACUUM FREEZE;

regards, tom lane

#11Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#8)
Re: Optimizer generates bad plans.

Tom Lane wrote:

Neil Conway <neilc@samurai.com> writes:

Interesting. The inconsistency you're seeing is a result of GEQO. I
would have hoped that it would have produced a better quality plan
more often, but apparently not. On my system, the regular query
optimizer handily beats GEQO for this query: it produces more
efficienty query plans 100% of the time and takes less time to do so.
For *this* query at least, raising geqo_threshold would be a good
idea, but that may not be true universally.

The current GEQO threshold was set some time ago; since then, the
regular optimizer has been improved while the GEQO code hasn't been
touched. It might well be time to ratchet up the threshold.

Anyone care to do some additional experiments?

Added to TODO:

* Check GUC geqo_threshold to see if it is still accurate

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073