Turning the PLANNER off

Started by Ben McMahanabout 23 years ago7 messages
#1Ben McMahan
mcmahanb@cs.rice.edu

I'm looking at different ways of optimizing queries with a large number of
joins. I write the same query in a number of different ways and compare
the running times. Now the problem is I do not want the optimizer
changing the queries. So I explicit state the order of the joins in the
FROM clause. I also turn off everything I can except for one type of join
(say hash join), and I've turned off geqo. But I find that the PLANNER
still takes an enormous amount of time for some queries. It doesn't look
like the Planner is actually optimizing (changing) anything, but just in
case, I was wondering if there was a way to turn off the PLANNER.

Note, when I say an enormous amount of time, I mean at least double the
time the EXECUTOR takes to actually answer the query.

Thanks for your help,

Ben McMahan

ps. here is a small example of what my queries look like (so you can see
if there is something else it might be deciding on):

SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5
FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN (r1 c1
(x4,x1,x2) JOIN r1 c0 (x1,x3,x4)
ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 ))
ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 ))
ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 ))
ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 );

A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4
where it also renames the columns to x4, x2, and x5 respectively.

#2Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Ben McMahan (#1)
Re: Turning the PLANNER off

That is a good question. The planner does more than just analyse the
query. It generates the Plan used by the executor, so that can't be
removed.

It is always a pain when the optimizer/planner takes longer than the
executor. We do have PREPARE/EXECUTE in 7.3beta for you to use.

---------------------------------------------------------------------------

Ben McMahan wrote:

I'm looking at different ways of optimizing queries with a large number of
joins. I write the same query in a number of different ways and compare
the running times. Now the problem is I do not want the optimizer
changing the queries. So I explicit state the order of the joins in the
FROM clause. I also turn off everything I can except for one type of join
(say hash join), and I've turned off geqo. But I find that the PLANNER
still takes an enormous amount of time for some queries. It doesn't look
like the Planner is actually optimizing (changing) anything, but just in
case, I was wondering if there was a way to turn off the PLANNER.

Note, when I say an enormous amount of time, I mean at least double the
time the EXECUTOR takes to actually answer the query.

Thanks for your help,

Ben McMahan

ps. here is a small example of what my queries look like (so you can see
if there is something else it might be deciding on):

SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5
FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN (r1 c1
(x4,x1,x2) JOIN r1 c0 (x1,x3,x4)
ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 ))
ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 ))
ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 ))
ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 );

A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4
where it also renames the columns to x4, x2, and x5 respectively.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3David Walker
pgsql@grax.com
In reply to: Bruce Momjian (#2)
Re: Turning the PLANNER off

Does PREPARE turn
select * from mytable
into
select mytable.field1,mytable.field2 from mynamespace.mytable
?

I was looking for this functionality for one of my projects so I'm curious.

Show quoted text

On Monday 28 October 2002 06:55 pm, (Via wrote:

That is a good question. The planner does more than just analyse the
query. It generates the Plan used by the executor, so that can't be
removed.

It is always a pain when the optimizer/planner takes longer than the
executor. We do have PREPARE/EXECUTE in 7.3beta for you to use.

---------------------------------------------------------------------------

Ben McMahan wrote:

I'm looking at different ways of optimizing queries with a large number
of joins. I write the same query in a number of different ways and
compare the running times. Now the problem is I do not want the
optimizer changing the queries. So I explicit state the order of the
joins in the FROM clause. I also turn off everything I can except for
one type of join (say hash join), and I've turned off geqo. But I find
that the PLANNER still takes an enormous amount of time for some queries.
It doesn't look like the Planner is actually optimizing (changing)
anything, but just in case, I was wondering if there was a way to turn
off the PLANNER.

Note, when I say an enormous amount of time, I mean at least double the
time the EXECUTOR takes to actually answer the query.

Thanks for your help,

Ben McMahan

ps. here is a small example of what my queries look like (so you can see
if there is something else it might be deciding on):

SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5
FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN
(r1 c1 (x4,x1,x2) JOIN r1 c0 (x1,x3,x4)
ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 ))
ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 ))
ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 ))
ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 );

A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4
where it also renames the columns to x4, x2, and x5 respectively.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: David Walker (#3)
Re: Turning the PLANNER off

David Walker wrote:

Does PREPARE turn
select * from mytable
into
select mytable.field1,mytable.field2 from mynamespace.mytable
?

I was looking for this functionality for one of my projects so I'm curious.

PREPARE turns the query into an internal format used by the executor.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Bruce Momjian (#2)
Re: Turning the PLANNER off

Ah, so Ben finally got around to posting here. Ben's a CS Grad student
here at Rice. His (current) project involves taking some interesting
results from constraint satisfaction and implementing them on a database:
one of the CS faculty has demonstrated that one class of highly joined
DB queries maps to a solved problem in constraint satisfaction. The end
goal would be an optimizer module or setting that recognizes this class
of query, and spits out a mathematically optimized join order.

So, in the interim, Ben's trying to do it by hand: preorder the joins
and demonstrate that the 'best' order is in fact the best. Then move on
to looking into integrating this, if possible: part of the problem is
recognizing the structure of the query, of course. Right now, the planner
is getting in the way - although he can extract the needed timing info,
he's wasting CPU cycles planning things that don't need it, limiting
the number of cases he can try.

As mentioned before, there's currently no interface to feed in a Plan,
so he's out of luck. Is there, programmatically, a way to do it?
Serialize a plan tree to a file, and feed it in latter, purely for
development purposes. How painful would that be? Should I send him in to
see if he can implement one quickly, or are there dragons hiding in there?

Ross

Show quoted text

On Mon, Oct 28, 2002 at 07:55:02PM -0500, Bruce Momjian wrote:

That is a good question. The planner does more than just analyse the
query. It generates the Plan used by the executor, so that can't be
removed.

It is always a pain when the optimizer/planner takes longer than the
executor. We do have PREPARE/EXECUTE in 7.3beta for you to use.

---------------------------------------------------------------------------

Ben McMahan wrote:

I'm looking at different ways of optimizing queries with a large number of
joins. I write the same query in a number of different ways and compare
the running times. Now the problem is I do not want the optimizer
changing the queries. So I explicit state the order of the joins in the
FROM clause. I also turn off everything I can except for one type of join
(say hash join), and I've turned off geqo. But I find that the PLANNER
still takes an enormous amount of time for some queries. It doesn't look
like the Planner is actually optimizing (changing) anything, but just in
case, I was wondering if there was a way to turn off the PLANNER.

Note, when I say an enormous amount of time, I mean at least double the
time the EXECUTOR takes to actually answer the query.

Thanks for your help,

Ben McMahan

ps. here is a small example of what my queries look like (so you can see
if there is something else it might be deciding on):

SELECT DISTINCT c0.x1 , c1.x2 , c0.x3 , c0.x4 , c2.x5
FROM r1 c4 (x4,x2,x5) JOIN (r0 c3 (x2,x3,x5) JOIN (r2 c2 (x3,x1,x5) JOIN (r1 c1
(x4,x1,x2) JOIN r1 c0 (x1,x3,x4)
ON ( c0.x4 = c1.x4 AND c0.x1 = c1.x1 ))
ON ( c0.x3 = c2.x3 AND c0.x1 = c2.x1 ))
ON ( c1.x2 = c3.x2 AND c0.x3 = c3.x3 AND c2.x5 = c3.x5 ))
ON ( c0.x4 = c4.x4 AND c1.x2 = c4.x2 AND c2.x5 = c4.x5 );

A quick reminder, FROM r1 c4 (x4,x2,x5) just renames a table r1 into c4
where it also renames the columns to x4, x2, and x5 respectively.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ross J. Reedstrom (#5)
Re: Turning the PLANNER off

"Ross J. Reedstrom" <reedstrm@rice.edu> writes:

So, in the interim, Ben's trying to do it by hand: preorder the joins
and demonstrate that the 'best' order is in fact the best. Then move on
to looking into integrating this, if possible: part of the problem is
recognizing the structure of the query, of course. Right now, the planner
is getting in the way - although he can extract the needed timing info,
he's wasting CPU cycles planning things that don't need it, limiting
the number of cases he can try.

What's the basis for your assertion that it's "planning things that
don't need it"? Given a JOIN-constrained query I do not believe the
planner will look at any cases other than the intended join order.

As mentioned before, there's currently no interface to feed in a Plan,
so he's out of luck. Is there, programmatically, a way to do it?
Serialize a plan tree to a file, and feed it in latter, purely for
development purposes. How painful would that be? Should I send him in to
see if he can implement one quickly, or are there dragons hiding in there?

He can do whatever he wants, as long as he has no illusions about
getting it accepted back into the sources ;-).

What would probably be more useful is to do some profiling to understand
why the planner is taking longer than he wants even with a
JOIN-constrained query. I should think this would be pretty quick.

regards, tom lane

#7Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Tom Lane (#6)
Re: Turning the PLANNER off

On Wed, Oct 30, 2002 at 10:59:39PM -0500, Tom Lane wrote:

What's the basis for your assertion that it's "planning things that
don't need it"? Given a JOIN-constrained query I do not believe the
planner will look at any cases other than the intended join order.

Well, that was a loose choice of words - let's say the planner seems to
be taking awfully long to build an execution tree with only one choice
available.

He can do whatever he wants, as long as he has no illusions about
getting it accepted back into the sources ;-).

Understood - this would be a hacking tool only.

What would probably be more useful is to do some profiling to understand
why the planner is taking longer than he wants even with a
JOIN-constrained query. I should think this would be pretty quick.

Yup, that targets the same question as above - anything 'uneeded' actually
happening in the planner? I'll send him off with this suggestion.

Ross