*_collapse_limit, geqo_threshold
I think we should try to do something about join_collapse_limit,
from_collapse_limit, and geqo_threshold for 8.5.
http://archives.postgresql.org/message-id/9134.1243289706@sss.pgh.pa.us
http://archives.postgresql.org/message-id/603c8f070905251800g5b86d2dav26eca7f417d15dbf@mail.gmail.com
I'm still of the opinion that join_collapse_limit is a loaded
foot-gun, because I don't think that users will expect that a join
specified this way:
SELECT ... FROM a JOIN b ON Pab JOIN c ON Pac JOIN d ON Pad ...
will behave differently than one specified this way:
SELECT ... FROM a, b, c, d WHERE Pab AND Pac AND Pad ...
The whole purpose of join_collapse_limit in the first instance is to
prevent planning time from getting out of control, but I don't see how
we can view it as a very effective safety valve when it depends so
heavily on which syntax is used. If the planning time for an N-way
join is excessive, then we're going to have a problem with excessive
planning time whenever the second syntax is selected, and I don't see
any reason to believe that users see the second syntax as "dangerous"
in terms of planning time but the first syntax as "safer".
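A concrete illustration of the asymmetry (a sketch; tables a, b, c, d and their join columns are hypothetical): with the limit set low, only the comma-separated spelling is freely reordered, while the explicit-JOIN spelling has its order largely frozen.

```sql
SET join_collapse_limit = 2;
-- JOIN nesting is largely preserved by the planner:
EXPLAIN SELECT * FROM a JOIN b ON a.x = b.x
                 JOIN c ON a.y = c.y
                 JOIN d ON a.z = d.z;
-- The same query in comma form is freely reordered
-- (governed by from_collapse_limit, default 8):
EXPLAIN SELECT * FROM a, b, c, d
         WHERE a.x = b.x AND a.y = c.y AND a.z = d.z;
RESET join_collapse_limit;
```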
One possibility would be to remove join_collapse_limit entirely, but
that would eliminate one possibly-useful piece of functionality that
it currently enables: namely, the ability to exactly specify the join
order by setting join_collapse_limit to 1. So one possibility would
be to rename the variable to something like explicit_join_order and make
it a Boolean; another possibility would be to change the default value
to INT_MAX.
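For reference, the join-order-forcing use-case looks like this (a sketch; tables a, b, c are hypothetical):

```sql
-- With join_collapse_limit = 1, explicit JOINs are never reordered,
-- so this query is joined strictly as ((a JOIN b) JOIN c).
SET join_collapse_limit = 1;
EXPLAIN SELECT *
FROM a
JOIN b ON b.a_id = a.id
JOIN c ON c.b_id = b.id;
RESET join_collapse_limit;
```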
The approach I've taken in the attached patch is to make 0 mean
"unlimited" and make that the default value. I don't have a strong
feeling about whether that's better than the other two options,
although it seems cleaner to me or I'd not have written the patch that
way. We could also consider adopting this same approach for
from_collapse_limit, though for some reason that behavior seems marginally
less pathological to me.
At any rate, regardless of whether this patch (or one of the other
approaches mentioned above) is adopted for 8.5, I think we should
raise the default values for whatever is left. The defaults basically
haven't been modified since they were put in, and my experience is
that even queries with 10 to 15 joins perform acceptably for OLTP
workloads, which are exactly the workloads where query planning time
is most likely to be an issue. So I would propose raising each of the
limits by 4 (to 12 for from_collapse_limit and join_collapse_limit if
we don't unlimit them entirely, and to 16 for geqo_threshold). I'm
interested in hearing from anyone who has practical experience with
tuning these variables, or any ideas on what we should test to get a
better idea as to how to set them.
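For experimentation, the proposed values can be tried per-session before touching postgresql.conf (a sketch, not part of the attached patch):

```sql
SET from_collapse_limit = 12;   -- current default is 8
SET join_collapse_limit = 12;   -- current default is 8 (if not unlimited)
SET geqo_threshold = 16;        -- current default is 12
```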
Thanks,
...Robert
Attachment: unlimit_join_collapse.patch (text/x-diff)
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 2288,2296 **** SELECT * FROM parent WHERE key = 2400;
</para>
<para>
! By default, this variable is set the same as
! <varname>from_collapse_limit</varname>, which is appropriate
! for most uses. Setting it to 1 prevents any reordering of
explicit <literal>JOIN</>s. Thus, the explicit join order
specified in the query will be the actual order in which the
relations are joined. The query planner does not always choose
--- 2288,2295 ----
</para>
<para>
! By default, this variable is set to <literal>0</>, which always
! allows rewriting. Setting it to 1 prevents any reordering of
explicit <literal>JOIN</>s. Thus, the explicit join order
specified in the query will be the actual order in which the
relations are joined. The query planner does not always choose
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
***************
*** 477,483 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
/* force the join order exactly at this node */
joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
}
! else if (list_length(leftjoinlist) + list_length(rightjoinlist) <=
join_collapse_limit)
{
/* OK to combine subproblems */
--- 477,484 ----
/* force the join order exactly at this node */
joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
}
! else if (join_collapse_limit == 0
! || list_length(leftjoinlist) + list_length(rightjoinlist) <=
join_collapse_limit)
{
/* OK to combine subproblems */
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 1275,1284 **** static struct config_int ConfigureNamesInt[] =
"constructs are not flattened."),
gettext_noop("The planner will flatten explicit JOIN "
"constructs into lists of FROM items whenever a "
! "list of no more than this many items would result.")
},
&join_collapse_limit,
! 8, 1, INT_MAX, NULL, NULL
},
{
{"geqo_threshold", PGC_USERSET, QUERY_TUNING_GEQO,
--- 1275,1285 ----
"constructs are not flattened."),
gettext_noop("The planner will flatten explicit JOIN "
"constructs into lists of FROM items whenever a "
! "list of no more than this many items would result. "
! "Zero indicates no limit.")
},
&join_collapse_limit,
! 0, 0, INT_MAX, NULL, NULL
},
{
{"geqo_threshold", PGC_USERSET, QUERY_TUNING_GEQO,
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 220,227 ****
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8
! #join_collapse_limit = 8 # 1 disables collapsing of explicit
! # JOIN clauses
#------------------------------------------------------------------------------
--- 220,229 ----
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8
! #join_collapse_limit = 0 # 1 disables collapsing of explicit
! # JOIN clauses, 0 always allows collapsing,
! # >1 allows collapsing of lists when FROM clause will
! # have <= join_collapse_limit items
#------------------------------------------------------------------------------
Robert Haas <robertmhaas@gmail.com> wrote:
I'm interested in hearing from anyone who has practical experience
with tuning these variables, or any ideas on what we should test to
get a better idea as to how to set them.
I don't remember any clear resolution to the wild variations in plan
time mentioned here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
I think it would be prudent to try to figure out why small changes in
the query caused the large changes in the plan times Andres was
seeing. Has anyone else ever seen such behavior? Can we get
examples? (It should be enough to get the statistics and the schema,
since this is about planning time, not run time.)
My own experience is that when we investigate a complaint about a
query not performing to user or application programmer expectations,
we have sometimes found that boosting these values has helped. We
boost them overall (in postgresql.conf) without ever having seen a
downside. We currently have geqo disabled and set both collapse
limits to 20. We should probably just set them both to several
hundred and not wait until some query with more than 20 tables
performs badly, but I'm not sure we have any of those yet.
In short, my experience is that when setting these higher has made any
difference at all, it has always generated a plan that saved more time
than the extra planning required. Well, I'd bet that there has been
an increase in the plan time of some queries which wound up with the
same plan anyway, but the difference has never been noticeable; the
net effect has been a plus for us.
I guess the question is whether there is anyone who has had a contrary
experience. (There must have been some benchmarks to justify adding
geqo at some point?)
-Kevin
On Jul 7, 2009, at 9:31 AM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov>
wrote:
Robert Haas <robertmhaas@gmail.com> wrote:
I'm interested in hearing from anyone who has practical experience
with tuning these variables, or any ideas on what we should test to
get a better idea as to how to set them.
I don't remember any clear resolution to the wild variations in plan
time mentioned here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
I think it would be prudent to try to figure out why small changes in
the query caused the large changes in the plan times Andres was
seeing. Has anyone else ever seen such behavior? Can we get
examples? (It should be enough to get the statistics and the schema,
since this is about planning time, not run time.)
Well, there's not really enough information there to figure out
specifically what was happening, but from 10,000 feet,
join_collapse_limit and from_collapse_limit constrain the join order.
If the estimates are all accurate, setting them to a value < infinity
will either leave the plans unchanged or make them worse. If it's
making them better, then the estimates are off and the join order
constraint happens to be preventing the planner from considering the
cases that really hurt you. But that's mostly luck.
My own experience is that when we investigate a complaint about a
query not performing to user or application programmer expectations,
we have sometimes found that boosting these values has helped. We
boost them overall (in postgresql.conf) without ever having seen a
downside. We currently have geqo disabled and set both collapse
limits to 20. We should probably just set them both to several
hundred and not wait until some query with more than 20 tables
performs badly, but I'm not sure we have any of those yet.
In short, my experience is that when setting these higher has made any
difference at all, it has always generated a plan that saved more time
than the extra planning required. Well, I'd bet that there has been
an increase in the plan time of some queries which wound up with the
same plan anyway, but the difference has never been noticeable; the
net effect has been a plus for us.
You have a big dataset, AIUI, so the right values for you might be too
high for some people with, say, OLTP workloads.
I guess the question is whether there is anyone who has had a contrary
experience. (There must have been some benchmarks to justify adding
geqo at some point?)
GEQO or something like it is certainly needed for very large planning
problems. The non-GEQO planner takes exponential time in the size of
the problem, so at some point that's going to get ugly. But
triggering it at the level we do now seems unnecessarily pessimistic
about what constitutes too much planning.
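To put rough numbers on that exponential growth, a quick sketch (runnable on any recent PostgreSQL) counts left-deep join orders, which alone grow as n!; the real search space, including bushy plans, is larger still, though the standard planner prunes heavily via dynamic programming:

```sql
WITH RECURSIVE join_orders(n, cnt) AS (
    SELECT 1, 1::numeric
    UNION ALL
    SELECT n + 1, cnt * (n + 1) FROM join_orders WHERE n < 15
)
SELECT n AS relations, cnt AS left_deep_orders FROM join_orders;
-- 12 relations already give 12! = 479,001,600 left-deep orderings.
```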
...Robert
Hi Kevin, Hi all,
On Tuesday 07 July 2009 16:31:14 Kevin Grittner wrote:
Robert Haas <robertmhaas@gmail.com> wrote:
I'm interested in hearing from anyone who has practical experience
with tuning these variables, or any ideas on what we should test to
get a better idea as to how to set them.
I don't remember any clear resolution to the wild variations in plan
time mentioned here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
I think it would be prudent to try to figure out why small changes in
the query caused the large changes in the plan times Andres was
seeing. Has anyone else ever seen such behavior? Can we get
examples? (It should be enough to get the statistics and the schema,
since this is about planning time, not run time.)
I don't think it is surprising that small changes to those variables change
the plan time widely on a complex query.
E.g. an increase of one in from_collapse_limit can completely change the
plan, because more inlining happens before optimization.
I don't know the exact behaviour in the case where more joins exist than
join_collapse_limit, but it is not hard to imagine that this can also
dramatically change the plan complexity. As there were quite a few different
views involved, any of the changes to the *_limit variables could have
triggered plan changes in different parts of the query.
I plan to revisit the issue you referenced, btw. First there was the release
phase, and then I could not motivate myself to investigate a bit more...
The mail you referenced contains a completely bogus and ugly query that shows
similar symptoms, by the way. I guess the variations would be even bigger if
differently sized views/subqueries were used.
My own experience is that when we investigate a complaint about a
query not performing to user or application programmer expectations,
we have sometimes found that boosting these values has helped. We
boost them overall (in postgresql.conf) without ever having seen a
downside. We currently have geqo disabled and set both collapse
limits to 20. We should probably just set them both to several
hundred and not wait until some query with more than 20 tables
performs badly, but I'm not sure we have any of those yet.
I have not found consistently better results with geqo enabled. Some queries
are better, others worse. Often the comparison is not reliably reproducible.
(The ability to set geqo to some "known" starting value would be nice for
such comparisons.)
I cannot reasonably plan some queries with join_collapse_limit set to 20. At
least not without setting the geqo limit very low and a geqo_effort to a low
value.
So I would definitely not agree that removing j_c_l is a good idea.
Andres
Andres Freund <andres@anarazel.de> writes:
I cannot reasonably plan some queries with join_collapse_limit set to 20. At
least not without setting the geqo limit very low and a geqo_effort to a low
value.
So I would definitely not agree that removing j_c_l is a good idea.
Can you show some specific examples? All of this discussion seems like
speculation in a vacuum ...
regards, tom lane
On Tuesday 07 July 2009 17:40:50 Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
I cannot reasonably plan some queries with join_collapse_limit set to 20.
At least not without setting the geqo limit very low and a geqo_effort to
a low value.
So I would definitely not agree that removing j_c_l is a good idea.
Can you show some specific examples? All of this discussion seems like
speculation in a vacuum ...
I still may not publish the original schema (and I still have not heard any
reasonable reason why not) - the crazy query in the referenced email shows
similar problems and has a somewhat similar structure.
If that is not enough I will try to design a schema that is similar and
different enough from the original schema. Will take a day or two though.
Andres
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
I guess the question is whether there is anyone who has had a contrary
experience. (There must have been some benchmarks to justify adding
geqo at some point?)
The CVS history shows that geqo was integrated on 1997-02-19, which
I think means that it must have been developed against Postgres95
(or even earlier Berkeley releases?). That was certainly before any
of the current community's work on the optimizer began. A quick look
at the code as it stood on that date suggests that the regular
optimizer's behavior for large numbers of rels was a lot worse than it
is today --- notably, it looks like it would consider a whole lot more
Cartesian-product joins than we do now; especially if you had "bushy"
mode turned on, which you'd probably have to do to find good plans in
complicated cases. There were also a bunch of enormous inefficiencies
that we've whittled down over time, such as the mechanisms for comparing
pathkeys or the use of integer Lists to represent relid sets.
So while I don't doubt that geqo was absolutely essential when it was
written, it's fair to question whether it still provides a real win.
And we could definitely stand to take another look at the default
thresholds.
regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
One possibility would be to remove join_collapse_limit entirely, but
that would eliminate one possibly-useful piece of functionality that
it currently enables: namely, the ability to exactly specify the join
order by setting join_collapse_limit to 1. So one possibility would
be to rename the variable to something like explicit_join_order and make
it a Boolean; another possibility would be to change the default value
to INT_MAX.
As the person who put in those thresholds, I kind of prefer going over
to the boolean definition. That was the alternative that we considered;
the numeric thresholds were used instead because they were easy to
implement and seemed to possibly offer more control. But I'm not
convinced that anyone has really used them profitably. I agree that
the ability to use JOIN syntax to specify the join order exactly (with
join_collapse_limit=1) is the only really solid use-case anyone has
proposed for either threshold. I'm interested in Andreas' comment that
he has use-cases where using the collapse_limit is better than allowing
geqo to take over for very large problems ... but I think we need to see
those use-cases and see if there's a better fix.
regards, tom lane
On Tue, Jul 7, 2009 at 5:58 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
So while I don't doubt that geqo was absolutely essential when it was
written, it's fair to question whether it still provides a real win.
And we could definitely stand to take another look at the default
thresholds
The whole point of these parameters is to save time planning large
complex queries -- which are rarely going to be the kind of short,
simple, fast to execute oltp queries where planning time makes a big
difference. The larger more complex the query the more likely it is to
be a long-running dss or olap style query where shaving one percent
off the runtime would be worth spending many seconds planning.
I propose that there's a maximum reasonable planning time within which a
programmer would normally expect the database to be able to come up
with a plan for virtually any query. Personally I would be
surprised if a plain EXPLAIN took more than, say, 30s, perhaps even
something more like 10s.
We should benchmark the planner on increasingly large sets of
relations on a typical developer machine and set geqo to whatever
value the planner can handle in that length of time. I suspect even at
10s you're talking about substantially larger values than the current
default.
Greg Stark <gsstark@mit.edu> writes:
We should benchmark the planner on increasingly large sets of
relations on a typical developer machine and set geqo to whatever
value the planner can handle in that length of time. I suspect even at
10s you're talking about substantially larger values than the current
default.
The problem is to find some realistic "benchmark" cases. That's one
reason why I was pestering Andreas to see his actual use cases ...
regards, tom lane
On Tuesday 07 July 2009 19:45:44 Tom Lane wrote:
Greg Stark <gsstark@mit.edu> writes:
We should benchmark the planner on increasingly large sets of
relations on a typical developer machine and set geqo to whatever
value the planner can handle in that length of time. I suspect even at
10s you're talking about substantially larger values than the current
default.
The problem is to find some realistic "benchmark" cases. That's one
reason why I was pestering Andreas to see his actual use cases ...
I will start writing a reduced/altered schema tomorrow then...
Andres
PS: It's "Andres" btw ;-)
On Jul 7, 2009, at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
One possibility would be to remove join_collapse_limit entirely, but
that would eliminate one possibly-useful piece of functionality that
it currently enables: namely, the ability to exactly specify the join
order by setting join_collapse_limit to 1. So one possibility would
be to rename the variable to something like explicit_join_order and make
it a Boolean; another possibility would be to change the default value
to INT_MAX.
As the person who put in those thresholds, I kind of prefer going over
to the boolean definition.
I'm OK with that, but out of conservatism suggested changing the
default to unlimited in this release. If by chance there is something
we're missing and these parameters are doing someone any good, we can
suggest that they set them back to the old values rather than telling
them to use a private build. If on the other hand we don't get any
complaints, we can remove them with greater confidence in a future
release. But maybe that's too conservative.
Now, here's another thought: if we think it's reasonable for people to
want to explicitly specify the join order, a GUC isn't really the best
fit, because it's all or nothing. Maybe we'd be better off dropping
the GUCs entirely and adding some other bit of syntax that forces the
join order, but only for that particular join.
That was the alternative that we considered;
the numeric thresholds were used instead because they were easy to
implement and seemed to possibly offer more control. But I'm not
convinced that anyone has really used them profitably. I agree that
the ability to use JOIN syntax to specify the join order exactly (with
join_collapse_limit=1) is the only really solid use-case anyone has
proposed for either threshold. I'm interested in Andreas' comment that
he has use-cases where using the collapse_limit is better than allowing
geqo to take over for very large problems ... but I think we need to see
those use-cases and see if there's a better fix.
regards, tom lane
Agreed.
...Robert
On Jul 7, 2009, at 7:37 PM, Greg Stark wrote:
I propose that there's a maximum reasonable planning time
It sounds so much like the planner_effort GUC that has been talked
about in the past...
http://archives.postgresql.org/pgsql-performance/2009-05/msg00137.php
...except this time you want to measure it in seconds. The problem
with measuring it in seconds is that when the time has elapsed, it's
not easy to switch from the classic planner to geqo without beginning
from scratch again.
Would it be possible to start geqo from current planner state?
Another idea would be to have more complex metrics for deciding when
to run geqo, that is guesstimate the query planning difficulty very
quickly, based on more than just the number of relations in the from:
presence of subqueries, UNION, EXISTS, IN, or branches in the WHERE
clause, the number of operators and index support for them, maybe some
information from the stats too... The idea would be to
- set an effort threshold from where we'd better run geqo (GUC,
disabling possible)
- if threshold enabled, compute metrics
- if metric >= threshold, use geqo, if not, classic planner
- maybe default to disabling the threshold
It seems it'd be easier to set the new GUC on a per query basis...
The obvious problem to this approach is that computing the metric will
take some time better spent at planning queries, but maybe we could
have fast path for easy queries, which will look a lot like $subject.
Regards,
--
dim
I hope this will give readers better ideas than its bare content...
On Jul 7, 2009, at 9:16 PM, Robert Haas wrote:
Now, here's another thought: if we think it's reasonable for people
to want to explicitly specify the join order, a GUC isn't really the
best fit, because it's all or nothing. Maybe we'd be better off
dropping the GUCs entirely and adding some other bit of syntax that
forces the join order, but only for that particular join.
MySQL calls them Straight Joins:
http://www.mysqlperformanceblog.com/2006/12/28/mysql-session-variables-and-hints/
I'm not sure our best move here would be in this direction :)
--
dim
Dimitri Fontaine <dfontaine@hi-media.com> writes:
Another idea would be to have more complex metrics for deciding when
to run geqo, that is guesstimate the query planning difficulty very
quickly, based on more than just the number of relations in the from:
presence of subqueries, UNION, EXISTS, IN, or branches in where
clause, number of operators and index support for them, maybe some
information from the stats too...
Pointless, since GEQO is only concerned with examining alternative join
orderings. I see no reason whatever to think that number-of-relations
isn't the correct variable to test to decide whether to use it.
regards, tom lane
Robert Haas <robertmhaas@gmail.com> wrote:
if we think it's reasonable for people to want to explicitly specify
the join order
Regardless of the syntax (GUC or otherwise), that is an optimizer
hint. I thought we were trying to avoid those.
Although -- we do have all those enable_* GUC values which are also
optimizer hints; perhaps this should be another of those?
enable_join_reorder?
-Kevin
On Jul 7, 2009, at 9:45 PM, Tom Lane wrote:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
Another idea would be to have more complex metrics for deciding when
to run geqo
Pointless, since GEQO is only concerned with examining alternative join
orderings. I see no reason whatever to think that number-of-relations
isn't the correct variable to test to decide whether to use it.
Oh. It seems I prefer showing my ignorance rather than learning enough
first. Writing mails is so much easier...
Sorry for the noise,
--
dim
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Although -- we do have all those enable_* GUC values which are also
optimizer hints; perhaps this should be another of those?
enable_join_reorder?
Not a bad suggestion, especially since turning it off would usually be
considered just about as bad an idea as turning off the other ones.
regards, tom lane
On Jul 7, 2009, at 3:03 PM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov>
wrote:
Robert Haas <robertmhaas@gmail.com> wrote:
if we think it's reasonable for people to want to explicitly specify
the join order
Regardless of the syntax (GUC or otherwise), that is an optimizer
hint. I thought we were trying to avoid those.
I guess my point is that there's not a lot of obvious benefit in
allowing the functionality to exist but handicapping it so that it's
useful in as few cases as possible. If the consensus is that we want
half a feature (but not more or less than half), that's OK with me,
but it's not obvious to me why we should choose to want that.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
I guess my point is that there's not a lot of obvious benefit in
allowing the functionality to exist but handicapping it so that it's
useful in as few cases as possible. If the consensus is that we want
half a feature (but not more or less than half), that's OK with me,
but it's not obvious to me why we should choose to want that.
Well, the question to my mind is whether the collapse_threshold GUCs in
their current form actually represent a feature ;-). They were put
in pretty much entirely on speculation that someone might find them
useful. Your argument is that they are not only useless but a foot-gun,
and so far we haven't got any clear contrary evidence. If we accept
that argument then we should take them out, not just change the default.
My own thought is that from_collapse_limit has more justification,
since it basically acts to stop a subquery from being flattened when
that would make the parent query too complex, and that seems like a
more understandable and justifiable behavior than treating JOIN
syntax specially. But I'm fine with removing join_collapse_limit
or reducing it to a boolean.
regards, tom lane
Robert Haas <robertmhaas@gmail.com> wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
Robert Haas <robertmhaas@gmail.com> wrote:
if we think it's reasonable for people to want to explicitly
specify the join order
Regardless of the syntax (GUC or otherwise), that is an optimizer
hint. I thought we were trying to avoid those.
I guess my point is that there's not a lot of obvious benefit in
allowing the functionality to exist but handicapping it so that it's
useful in as few cases as possible. If the consensus is that we
want half a feature (but not more or less than half), that's OK with
me, but it's not obvious to me why we should choose to want that.
Ever since I've been following these lists, there have been a pretty
steady dribble of requests for optimizer hints of one type or another.
The standard response has always been, "If you have a query which is
optimizing poorly, show us, and we'll try to fix the optimizer so that
it doesn't need hints to do a good job." The enable_* GUCs
effectively *are* optimizer hints, but they are strongly discouraged
for anything but diagnostic purposes. I guess I don't see any reason
to consider this issue as being different.
If implementing them this way seems to handicap them, I guess that's
probably to discourage their use, which seems reasonable to me; I bet
people would shoot themselves in the foot much more often than they
would help themselves with such options, we'd lose information which
might help to improve the optimizer, and the list would probably get a
pretty steady stream of "slow queries" which would turn out to be
(after digging deep enough) caused by people using hints that caused a
sub-optimal plan to be chosen.
-Kevin
Tom Lane wrote:
My own thought is that from_collapse_limit has more justification,
since it basically acts to stop a subquery from being flattened when
that would make the parent query too complex, and that seems like a
more understandable and justifiable behavior than treating JOIN
syntax specially.
Isn't that what we use OFFSET 0 for? That one has also the nice
property that you can actually specify which subquery you want to
prevent from being flattened.
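As a sketch of that point (table and column names are hypothetical), OFFSET 0 acts as a per-subquery fence against flattening:

```sql
-- The OFFSET 0 keeps this subquery from being pulled up into the parent
-- query, independently of from_collapse_limit.
SELECT t.*, s.val
FROM some_table t
JOIN (SELECT id, val FROM other_table WHERE flag OFFSET 0) s
  ON s.id = t.other_id;
```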
Personally I have never seen a case where the collapse_limits were
useful tools.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
My own thought is that from_collapse_limit has more justification,
since it basically acts to stop a subquery from being flattened when
that would make the parent query too complex, and that seems like a
more understandable and justifiable behavior than treating JOIN
syntax specially.
Isn't that what we use OFFSET 0 for? That one has also the nice
property that you can actually specify which subquery you want to
prevent from being flattened.
Well, if you want to modify your queries to prevent long planning times,
that'd be one way to do it. It doesn't seem like a generally useful
answer to me though. For example, typically the subquery would actually
be a view that might be used in various contexts. If you stick an
OFFSET in it then you disable flattening in all those contexts, likely
not the best answer.
Personally I have never seen a case where the collapse_limits were
useful tools.
I'm not convinced they're useful either.
regards, tom lane
On Jul 7, 2009, at 4:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I guess my point is that there's not a lot of obvious benefit in
allowing the functionality to exist but handicapping it so that it's
useful in as few cases as possible. If the consensus is that we want
half a feature (but not more or less than half), that's OK with me,
but it's not obvious to me why we should choose to want that.
Well, the question to my mind is whether the collapse_threshold GUCs in
their current form actually represent a feature ;-). They were put
in pretty much entirely on speculation that someone might find them
useful. Your argument is that they are not only useless but a foot-gun,
and so far we haven't got any clear contrary evidence. If we accept
that argument then we should take them out, not just change the default.
My own thought is that from_collapse_limit has more justification,
since it basically acts to stop a subquery from being flattened when
that would make the parent query too complex, and that seems like a
more understandable and justifiable behavior than treating JOIN
syntax specially. But I'm fine with removing join_collapse_limit
or reducing it to a boolean.
That's pretty much where I am with it, too. The feature I was
referring to was not the collapse limits, but the ability to
explicitly specify the join order, which perhaps could be a useful
tool for reducing planning time or coping with bad estimates if you
could do it for only some of the joins in the query, but which we're
instead proposing to keep as an all-or-nothing flag.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
On Jul 7, 2009, at 4:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My own thought is that from_collapse_limit has more justification,
That's pretty much where I am with it, too. The feature I was
referring to was not the collapse limits, but the ability to
explicitly specify the join order, which perhaps could be a useful
tool for reducing planning time or coping with bad estimates if you
could do it for only some of the joins in the query, but which we're
instead proposing to keep as an all-or-nothing flag.
It's pretty much all-or-nothing now: the GUC does not give you any sort
of useful control over *which* joins are reorderable.
regards, tom lane
On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Jul 7, 2009, at 4:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My own thought is that from_collapse_limit has more justification,
That's pretty much where I am with it, too. The feature I was
referring to was not the collapse limits, but the ability to
explicitly specify the join order, which perhaps could be a useful
tool for reducing planning time or coping with bad estimates if you
could do it for only some of the joins in the query, but which we're
instead proposing to keep as an all-or-nothing flag.
It's pretty much all-or-nothing now: the GUC does not give you any sort
of useful control over *which* joins are reorderable.
Yes. So the way I see it, the options are:
1. We can remove join_collapse_limit completely and provide no
substitute. In this case, the ability to explicitly specify the join
order will be gone.
2. We can remove join_collapse_limit but provide a different, Boolean
GUC instead, like enable_join_reordering. In this case, we're not
actually reducing the number of GUCs, just the size of the foot-gun.
3. We can remove join_collapse_limit and provide an alternative way to
explicitly specify the join order that is more flexible. This both
reduces the number of GUCs and arguably provides some useful
functionality that we don't have now.
It sounds like your vote is for #2, which, as I say, seems like a
feature with one arm tied behind its back, but hey, what do I know?
Accepting that as the consensus in the absence of contrary votes, we
still need to decide what to do about from_collapse_limit and
geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or
geqo_threshold, because the basic algorithm is clearly exponential
time and eventually you have to start worrying about that, but we
could raise the value. What to do about from_collapse_limit is
less clear to me.
...Robert
Tom Lane wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
I guess the question is whether there is anyone who has had a contrary
experience. (There must have been some benchmarks to justify adding
geqo at some point?)
The CVS history shows that geqo was integrated on 1997-02-19, which
I think means that it must have been developed against Postgres95
So while I don't doubt that geqo was absolutely essential when it was
written, it's fair to question whether it still provides a real win.
And we could definitely stand to take another look at the default
thresholds.
Well there is a TODO item about implementing an alternative to GEQO
(which is being treated more and more as the underdog of the project):
http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us
Would people be interested in someone working on that item?
Cheers,
Jan
On Tue, Jul 07, 2009 at 09:31:14AM -0500, Kevin Grittner wrote:
I don't remember any clear resolution to the wild variations in plan
time mentioned here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
I think it would be prudent to try to figure out why small changes in
the query caused the large changes in the plan times Andres was
seeing. Has anyone else ever seen such behavior? Can we get
examples? (It should be enough to get the statistics and the schema,
since this is about planning time, not run time.)
With joins between statistically indistinguishable columns, I see planning times
change by a factor of ~4 for each join added or removed (postgres 8.3). Varying
join_collapse_limit in the neighborhood of the actual number of joins has a
similar effect. See attachment with annotated timings. The example uses a
single table joined to itself, but using distinct tables with identical contents
yields the same figures.
The exponential factor seems smaller for real queries. I have a query of
sixteen joins that takes 71s to plan deterministically; it looks like this:
SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
For the real query, removing one join drops plan time to 26s, and removing two
drops the time to 11s. I don't have a good theory for the multiplier changing
from 4 for the trivial demonstration to ~2.5 for this real query. Re-enabling
geqo drops plan time to .5s. These tests used default_statistics_target = 1000,
but dropping that to 100 does not change anything dramatically.
I guess the question is whether there is anyone who has had a contrary
experience. (There must have been some benchmarks to justify adding
geqo at some point?)
I have queries with a few more joins (19-21), and I cancelled attempts to plan
them deterministically after 600+ seconds and 10+ GiB of memory usage. Even
with geqo_effort = 10, they plan within 5-15s with good results.
All that being said, I've never encountered a situation where a value other than
1 or <inf> for *_collapse_limit appeared optimal.
nm
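Noah's geqo_effort observation above suggests the kind of knob-twiddling involved; a hedged sketch (the specific values are illustrative, not recommendations):

```sql
-- With ~20 relations, exhaustive search was cancelled after 600+ seconds,
-- while GEQO at maximum effort planned in seconds. geqo_effort ranges 1..10
-- (default 5) and trades planning time for plan quality.
SET geqo = on;
SET geqo_threshold = 12;   -- use GEQO at or above this many FROM items
SET geqo_effort = 10;      -- spend the most GEQO planning effort
EXPLAIN SELECT ...;        -- the 19-21 way join in question
```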
Attachments:
Hi,
When I was first familiarizing myself with PostgreSQL, I took a
walk through its documentation on GEQO and similar processes in
the literature. One big advantage of GEQO is that you can trade
off planning time for plan optimization. I do agree that it should
be updated, but there were definite cases in the literature where
the planning time for exhaustive searches could take orders of
magnitude more time to execute than the differences in the execution
times of the differing plans.
My two cents,
Ken
[Attachment (annotated timings) from Noah Misch's message above]
SET geqo = off;
SET join_collapse_limit = 100;
CREATE TEMP TABLE t AS SELECT * FROM generate_series(1, 1000) f(n); ANALYZE t;

--- Vary join count
-- 242.4s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11 NATURAL JOIN t t12;
-- 31.2s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11;
-- 8.1s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 2.0s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09;
-- 0.5s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08;

--- Vary join_collapse_limit
-- 8.1s
SET join_collapse_limit = 100; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 8.0s
SET join_collapse_limit = 11; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 2.2s
SET join_collapse_limit = 10; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 0.5s
SET join_collapse_limit = 9; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 0.1s
SET join_collapse_limit = 8; EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
It's pretty much all-or-nothing now: the GUC does not give you any sort
of useful control over *which* joins are reorderable.
Yes. So the way I see it, the options are:
1. We can remove join_collapse_limit completely and provide no
substitute. In this case, the ability to explicitly specify the join
order will be gone.
2. We can remove join_collapse_limit but provide a different, Boolean
GUC instead, like enable_join_reordering. In this case, we're not
actually reducing the number of GUCs, just the size of the foot-gun.
3. We can remove join_collapse_limit and provide an alternative way to
explicitly specify the join order that is more flexible. This both
reduces the number of GUCs and arguably provides some useful
functionality that we don't have now.
It sounds like your vote is for #2, which, as I say, seems like a
feature with one arm tied behind its back, but hey, what do I know?
Well, the reason I'm not voting for #3 is that it looks like a lot of
work to implement something that would basically be a planner hint,
which I'm generally against; furthermore, it's a hint that there's been
no demand for. (We're not even certain that anyone is using the ability
to *fully* specify the join order, much less wanting some undetermined
compromise between manual and automatic control.) And anyway I didn't
hear anyone volunteering to do it. So the realistic alternatives are
#1, #2, or "do nothing"; and out of those I like #2.
Accepting that as the consensus in the absence of contrary votes, we
still need to decide what to do about from_collapse_limit and
geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or
geqo_threshold, because the basic algorithm is clearly exponential
time and eventually you have to start worrying about that, but we
could raise the value. What to do about from_collapse_limit is
less clear to me.
I do not think there is a good argument for eliminating geqo_threshold.
There might well be an argument for cranking up its default value;
but that would take some hard data, which seems lacking at the moment.
I'm on the fence about from_collapse_limit. The argument for having
it seems to be that there might be cases where not folding a subquery
is preferable to folding it and then taking your chances with GEQO.
But I'm not really convinced there are any.
It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query. You might get a good plan or an awful one, but at least
it'd be the same one each time. DBAs like predictability.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
It occurs to me that one way to make GEQO less scary would be to
take out the nondeterminism by resetting its random number generator
for each query. You might get a good plan or an awful one, but at
least it'd be the same one each time. DBAs like predictability.
+1 The biggest reason that I've tended to avoid geqo is that I would
never know when it might do something really stupid with a query one
time out of some large number, leading to mysterious complaints which
could eat a lot of time.
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea....
-Kevin
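For what it's worth, a settable seed along the lines Kevin describes did eventually appear as the geqo_seed GUC (range 0.0 to 1.0) in later PostgreSQL releases; a sketch of how it plays out:

```sql
SET geqo = on;
SET geqo_seed = 0.0;    -- fixed seed: replanning the same query is deterministic
EXPLAIN SELECT ...;
SET geqo_seed = 0.37;   -- "roll the dice again" if the chosen plan was poor
EXPLAIN SELECT ...;
```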
On Wed, Jul 08, 2009 at 04:13:11PM -0500, Kevin Grittner wrote:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
It occurs to me that one way to make GEQO less scary would be to
take out the nondeterminism by resetting its random number generator
for each query. You might get a good plan or an awful one, but at
least it'd be the same one each time. DBAs like predictability.
+1 The biggest reason that I've tended to avoid geqo is that I would
never know when it might do something really stupid with a query one
time out of some large number, leading to mysterious complaints which
could eat a lot of time.
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea....
-Kevin
+1 I like the idea of a session GUC for the random number seed. If
we can come up with a way to prune the search space more aggressively,
GEQO (or GEQO2) will be much less prone to generating a bad plan.
I also think that a session variable would make it easier to test
GEQO* by removing the nondeterminism.
Ken
Noah Misch <noah@leadboat.com> writes:
With joins between statistically indistinguishable columns, I see planning times
change by a factor of ~4 for each join added or removed (postgres 8.3). Varying
join_collapse_limit in the neighborhood of the actual number of joins has a
similar effect. See attachment with annotated timings. The example uses a
single table joined to itself, but using distinct tables with identical contents
yields the same figures.
Hey, some hard data! Thanks for doing that.
The exponential factor seems smaller for real queries. I have a query of
sixteen joins that takes 71s to plan deterministically; it looks like this:
SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
I'm confused here --- I think you must have over-anonymized your query.
Surely the ON conditions for the left joins should be referencing t3,
t4, etc?
For the real query, removing one join drops plan time to 26s, and
removing two drops the time to 11s. I don't have a good theory for
the multiplier changing from 4 for the trivial demonstration to ~2.5
for this real query.
The rule of thumb that says that an n-way join requires 2^n work is only
true if we consider every single combination of possible joins, which
normally we don't. The planner prefers join paths that follow join
clauses, and will only consider clauseless joins when it has no other
choice. I believe that real queries tend to be pretty non-flat in this
space and so the number of join paths to consider is a lot less than 2^n.
Your synthesized query, on the other hand, allows any relation to be
joined to any other --- it might not look that way, but after creating
derived equalities there will be a potential join clause linking every
relation to every other one. So I think you were testing the worst case,
and I'm not surprised that more-typical queries would show a slower
growth curve.
This is why I don't trust benchmarking the planner on artificial
queries. Still, I appreciate your work, because it gives us at least
some hard data to go on.
regards, tom lane
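Tom's point about derived equalities can be seen in a tiny sketch (names illustrative): even when join clauses are written as a chain, equivalence-class deduction makes every pair of relations joinable.

```sql
-- Written clauses form a chain: a-b, b-c.
SELECT 1
FROM a
JOIN b ON a.x = b.x
JOIN c ON b.x = c.x;
-- The planner puts a.x, b.x, c.x in one equivalence class and derives
-- a.x = c.x, so the (a,c) join is also clause-driven. With N relations
-- all sharing one key, every pair is joinable: the worst case for search.
```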
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea....
If memory serves, we actually had exactly that at some point. But I
think the reason it got taken out was that it interfered with the
behavior of the random() function for everything else. We'd have to
give GEQO its own private random number generator.
regards, tom lane
On Wed, Jul 08, 2009 at 05:46:02PM -0400, Tom Lane wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea....
If memory serves, we actually had exactly that at some point. But I
think the reason it got taken out was that it interfered with the
behavior of the random() function for everything else. We'd have to
give GEQO its own private random number generator.
regards, tom lane
A separate random number generator for GEQO makes a lot of sense.
Cheers,
Ken
On Jul 8, 2009, at 3:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
It's pretty much all-or-nothing now: the GUC does not give you any sort
of useful control over *which* joins are reorderable.
Yes. So the way I see it, the options are:
1. We can remove join_collapse_limit completely and provide no
substitute. In this case, the ability to explicitly specify the join
order will be gone.
2. We can remove join_collapse_limit but provide a different, Boolean
GUC instead, like enable_join_reordering. In this case, we're not
actually reducing the number of GUCs, just the size of the foot-gun.
3. We can remove join_collapse_limit and provide an alternative way to
explicitly specify the join order that is more flexible. This both
reduces the number of GUCs and arguably provides some useful
functionality that we don't have now.
It sounds like your vote is for #2, which, as I say, seems like a
feature with one arm tied behind its back, but hey, what do I know?
Well, the reason I'm not voting for #3 is that it looks like a lot of
work to implement something that would basically be a planner hint,
which I'm generally against; furthermore, it's a hint that there's been
no demand for. (We're not even certain that anyone is using the ability
to *fully* specify the join order, much less wanting some undetermined
compromise between manual and automatic control.) And anyway I didn't
hear anyone volunteering to do it. So the realistic alternatives are
#1, #2, or "do nothing"; and out of those I like #2.
That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1. #2 is a planner hint, too, just not a very good
one. If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)
Accepting that as the consensus in the absence of contrary votes, we
still need to decide what to do about from_collapse_limit and
geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or
geqo_threshold, because the basic algorithm is clearly exponential
time and eventually you have to start worrying about that, but we
could raise the value. What to do about from_collapse_limit is
less clear to me.
I do not think there is a good argument for eliminating geqo_threshold.
There might well be an argument for cranking up its default value;
but that would take some hard data, which seems lacking at the moment.
I'm on the fence about from_collapse_limit. The argument for having
it seems to be that there might be cases where not folding a subquery
is preferable to folding it and then taking your chances with GEQO.
But I'm not really convinced there are any.
Me either. You could probably get the same effect in other ways if
you actually needed it, like OFFSET 0 or wrapping the subquery in a
SRF. I'm leaning more and more toward thinking we should just nuke it.
It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query. You might get a good plan or an awful one, but at least
it'd be the same one each time. DBAs like predictability.
Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1. #2 is a planner hint, too, just not a very good
one. If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)
Well, I can cite one reasonably plausible use case: when you have an
umpteen-way join you need to execute a lot, and you don't want to pay
for an exhaustive search, but GEQO doesn't reliably find a good plan.
What you can do is let the system do an exhaustive search once to find
the best plan, then you rearrange the query to specify that join order
via JOINs, and turn off join collapsing. Presto, good plan every time
with very little planning time expended.
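A sketch of that recipe (settings and query shapes illustrative):

```sql
-- Step 1 (one-off): exhaustive search to discover a good join order.
SET geqo = off;
SET join_collapse_limit = 100;
EXPLAIN SELECT ...;               -- read off the join order the planner chose
-- Step 2 (production): rewrite the FROM clause with explicit JOIN nesting
-- in that order, then pin it:
SET join_collapse_limit = 1;      -- joins are performed exactly as written
SELECT ... FROM ((a JOIN b ON ...) JOIN c ON ...) JOIN d ON ...;
```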
Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure. No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness. Whereas what I describe above is something that costs
us nearly nothing to provide. The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.
It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query. You might get a good plan or an awful one, but at least
it'd be the same one each time. DBAs like predictability.
Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.
I was imagining a GUC that would make the reset optional, in case anyone
really does want to have unstable plans. I don't have much doubt about
what typical users will prefer though.
regards, tom lane
On Wed, Jul 08, 2009 at 09:26:35PM -0400, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1. #2 is a planner hint, too, just not a very good
one. If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)
Well, I can cite one reasonably plausible use case: when you have an
umpteen-way join you need to execute a lot, and you don't want to pay
for an exhaustive search, but GEQO doesn't reliably find a good plan.
What you can do is let the system do an exhaustive search once to find
the best plan, then you rearrange the query to specify that join order
via JOINs, and turn off join collapsing. Presto, good plan every time
with very little planning time expended.
Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure. No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness. Whereas what I describe above is something that costs
us nearly nothing to provide. The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.
This sounds like planner hints to me. The argument against hinting, AIUI, is
that although the plan you've guaranteed via hints may be a good one today,
when the data change a bit your carefully crafted plan happens to become a bad
one, but you're no longer around to change the hints accordingly. Entire
stored plans, or predetermined seeds for GEQO's random number generator all
boil down to saying, "I want you to use this plan henceforth and forever."
It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query. You might get a good plan or an awful one, but at least
it'd be the same one each time. DBAs like predictability.
Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.
I was imagining a GUC that would make the reset optional, in case anyone
really does want to have unstable plans. I don't have much doubt about
what typical users will prefer though.
Do we know that GEQO plans are, in reality, less stable than the usual
planner? Certainly on paper it appears they could be, but the mailing lists
are full of emails about "this query's plan changed and performance suddenly
tanked; how do I fix it?" so I'm unconvinced this is a problem unique to GEQO.
Which in turn boils down to "we need real world data to look at".
--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Joshua Tolley <eggyknap@gmail.com> writes:
This sounds like planner hints to me. The argument against hinting, AIUI, is
that although the plan you've guaranteed via hints may be a good one today,
when the data change a bit your carefully crafted plan happens to become a bad
one, but you're no longer around to change the hints accordingly.
That's one argument against them. Another one is that time put into
developing a planner hints mechanism is time that would be better spent
on fixing the underlying planner problem. However, that second argument
doesn't apply to something like join_collapse_limit, whose
implementation is pretty nearly a one-liner (as are the various existing
enable_whatever switches). Again, it's all about cost/benefit ratio.
Do we know that GEQO plans are, in reality, less stable than the usual
planner?
Yes, we do. There aren't that many examples in the archives, but that
likely is because join_collapse_limit and from_collapse_limit are by
default set to try to prevent use of GEQO. The most recent clear-cut
example I can find is
http://archives.postgresql.org/pgsql-general/2008-10/msg01449.php
wherein the guy has apparently written a "flat" FROM of 17 tables,
and so neither collapse_limit will kick in. If we get rid of the
collapse_limits as proposed in this thread, you'll start seeing
those sorts of complaints all over the place, unless we make
GEQO deterministic.
regards, tom lane
On Wed, Jul 08, 2009 at 05:23:16PM -0400, Tom Lane wrote:
Noah Misch <noah@leadboat.com> writes:
The exponential factor seems smaller for real queries. I have a query of
sixteen joins that takes 71s to plan deterministically; it looks like this:
SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
I'm confused here --- I think you must have over-anonymized your query.
Surely the ON conditions for the left joins should be referencing t3,
t4, etc?
Yes. Aside from that error, I picked the wrong features to emphasize and gloss
over. Only two relations (amidst `dim0 ... dim6') resemble dimensions, having
an implied relative position on the plan tree. The other fourteen relations
share a join key. That explains the distinctive plan cost; this query resembles
the utterly unconstrained artificial query to a larger degree than most.
For the real query, removing one join drops plan time to 26s, and
removing two drops the time to 11s. I don't have a good theory for
the multiplier changing from 4 for the trivial demonstration to ~2.5
for this real query.
The rule of thumb that says that an n-way join requires 2^n work is only
true if we consider every single combination of possible joins, which
normally we don't. The planner prefers join paths that follow join
clauses, and will only consider clauseless joins when it has no other
choice. I believe that real queries tend to be pretty non-flat in this
space and so the number of join paths to consider is a lot less than 2^n.
Your synthesized query, on the other hand, allows any relation to be
joined to any other --- it might not look that way, but after creating
derived equalities there will be a potential join clause linking every
relation to every other one. So I think you were testing the worst case,
and I'm not surprised that more-typical queries would show a slower
growth curve.
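Tom's point about clause-following search can be made concrete with a toy
enumerator (an illustrative Python sketch, not planner code; the counting
rule is a simplified stand-in for the real join search): count the ordered
(left, right) sub-join pairs a System-R-style search would examine, requiring
both halves to be connected through join clauses and linked by at least one
clause.

```python
from itertools import combinations

def count_join_pairs(n, edges):
    """Count ordered (left, right) sub-join pairs where both halves are
    connected through join clauses and at least one clause links them --
    a rough proxy for the planner's search effort."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def connected(rels):
        rels = set(rels)
        seen = {next(iter(rels))}
        stack = list(seen)
        while stack:
            for w in adj[stack.pop()] & (rels - seen):
                seen.add(w)
                stack.append(w)
        return seen == rels

    pairs = 0
    for size in range(2, n + 1):
        for sub in combinations(range(n), size):
            if not connected(sub):
                continue
            for lsize in range(1, size):
                for left in combinations(sub, lsize):
                    right = set(sub) - set(left)
                    if (connected(left) and connected(right)
                            and any(adj[v] & right for v in left)):
                        pairs += 1
    return pairs

# A 6-way chain (each join clause links adjacent relations) versus a
# 6-way clique (derived equalities link every relation to every other).
chain = count_join_pairs(6, [(i, i + 1) for i in range(5)])
clique = count_join_pairs(6, [(i, j) for i in range(6) for j in range(i + 1, 6)])
```

For six relations the clique already examines 602 pairs against the chain's
70, and the gap widens with every added relation.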
Describing in those terms illuminates much. While the concepts do suggest 2^N
worst-case planning cost, my artificial test case showed a rigid 4^N pattern;
what could explain that?
Thanks,
nm
Noah Misch <noah@leadboat.com> writes:
Describing in those terms illuminates much. While the concepts do suggest 2^N
worst-case planning cost, my artificial test case showed a rigid 4^N pattern;
what could explain that?
Well, the point of the 2^N concept is just that adding one more relation
multiplies the planning work by a constant factor. It's useful data
that you find the factor to be about 4, but I wouldn't have expected the
model to tell us that.
regards, tom lane
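The per-relation factor can be read directly off the timings quoted upthread
(71s for the full query, then 26s and 11s with one and two joins removed); a
trivial Python check, for illustration only:

```python
# Plan times reported upthread, keyed by number of joined relations.
times = {16: 71.0, 15: 26.0, 14: 11.0}

# Removing one relation divides planning time by roughly this factor.
factors = [times[n] / times[n - 1] for n in (16, 15)]
# 71/26 ~= 2.73 and 26/11 ~= 2.36: a per-relation multiplier near 2.5,
# well under the ~4 seen for the fully-connected artificial query.
```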
On Jul 8, 2009, at 8:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1. #2 is a planner hint, too, just not a very good
one. If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)
Well, I can cite one reasonably plausible use case: when you have an
umpteen-way join you need to execute a lot, and you don't want to pay
for an exhaustive search, but GEQO doesn't reliably find a good plan.
What you can do is let the system do an exhaustive search once to find
the best plan, then you rearrange the query to specify that join order
via JOINs, and turn off join collapsing. Presto, good plan every time
with very little planning time expended.
Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure. No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness. Whereas what I describe above is something that costs
us nearly nothing to provide. The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.
I don't think that would be my answer because plan caching sounds
hard. It would be nice to have, though. :-)
But it's clearly a planner hint, however you slice it.
It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query. You might get a good plan or an awful one, but at least
it'd be the same one each time. DBAs like predictability.
Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.
I was imagining a GUC that would make the reset optional, in case
anyone really does want to have unstable plans. I don't have much
doubt about what typical users will prefer though.
OK.
...Robert
Hi,
Robert Haas <robertmhaas@gmail.com> writes:
On Jul 8, 2009, at 8:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure. No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness. Whereas what I describe above is something that costs
us nearly nothing to provide. The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.
I don't think that would be my answer because plan caching sounds hard. It
would be nice to have, though. :-)
In fact I think marshalling plans (and unmarshalling them) would rather
be the easy part of it, what looks very hard is the problem already
mentioned here in another context (gathering statistics on views):
http://archives.postgresql.org/pgsql-performance/2009-06/msg00118.php
How to match a random user query with the stored plans with as little
analyze as possible?
But it's clearly a planner hint, however you slice it.
Agreed.
Regards,
--
dim
On Wed, Jul 08, 2009 at 10:28:02AM -0500, Robert Haas wrote:
On Jul 8, 2009, at 8:43 AM, Noah Misch <noah@leadboat.com> wrote:
The exponential factor seems smaller for real queries. I have a query of
sixteen joins that takes 71s to plan deterministically; it looks like this:
SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
Very interesting! I am guessing that the problem here is that all the
inner joins commute, as do all the left joins, so there are a lot of
possibilities to explore. I also suspect that the actual join order
doesn't matter much, so it's a good candidate for GEQO. Even if you had
some restriction clauses against your dimension/t tables, that would
probably just mean that you want to do those joins first, and the rest
afterwards, which still seems like it ought to be OK for GEQO.
But, in many queries of this size, the join order is more constrained.
Some of the later joins depend on the tables added by the earlier ones,
rather than all referring back to the base table. Is there some way we
could estimate the number of join offerings we'll have to consider
relatively cheaply? That might be a better metric than # of tables.
Observing the pathological case taking plan time proportional to 4^N, apply
geqo_threshold as "use GEQO when comparing more than geqo_threshold * 4^N join
order possibilities"? I'm not sure whether that figure is available (early
enough) to factor into the decision. Such a change would probably imply a lower
default geqo_threshold, around 9 to 11. The number of typical queries subject
to GEQO would nonetheless decrease.
nm
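The proposal above can be sketched as a decision rule (illustrative Python;
the growth constant, the budget, and the threshold of 11 are assumptions
drawn from this thread, not PostgreSQL code):

```python
def estimated_join_orders(n_rels, growth=4.0):
    """Crude stand-in for a cheap estimate of join-order possibilities.
    growth=4 matches the pathological all-connected case; the real
    query above grew closer to 2.5 per relation."""
    return growth ** n_rels

def use_geqo(n_rels, budget=4.0 ** 11):
    """Switch to GEQO when the estimated search space exceeds a fixed
    budget, rather than when the raw table count exceeds geqo_threshold.
    The budget shown corresponds to a threshold of 11 under worst-case
    growth; both numbers are illustrative."""
    return estimated_join_orders(n_rels) > budget
```

Under this rule a 12-way all-connected join trips the budget, while a 12-way
clause-constrained join growing at ~2.5 per relation stays far under it.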
Robert Haas <robertmhaas@gmail.com> wrote:
If, as you suggest, it isn't actually useful, then why keep it at
all?
I was imagining that someone who has a query which is taking a long
time to run, and asserts that it would run faster if only the
optimizer would arrange the joins a certain way, could test that
theory, as part of the diagnostic process. (In other words, for
similar reasons that the other enable_* GUC settings exist.)
I would certainly not want to use it in production, and wouldn't
expect that would normally be a very good idea.
-Kevin
On Wed, Jul 8, 2009 at 8:26 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1. #2 is a planner hint, too, just not a very good
one. If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)Well, I can cite one reasonably plausible use case: when you have an
umpteen-way join you need to execute a lot, and you don't want to pay
for an exhaustive search, but GEQO doesn't reliably find a good plan.
What you can do is let the system do an exhaustive search once to find
the best plan, then you rearrange the query to specify that join order
via JOINs, and turn off join collapsing. Presto, good plan every time
with very little planning time expended.
In the Oracle world they use the hint "ORDERED" for this purpose. As
the Oracle optimizer has gotten smarter over the years I've seen less
need to use it, but over all, compared to other Oracle hints it does
not seem to be extremely
sensitive to data / index / stats changes. When you think about why
such ordering might work that makes sense to me: small tables can be
used early to prune large tables later on. Typically, these smaller
tables are static config info type data. Eg. pick species, then choose
which of the 10 million pathology samples you have match that species.
Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure. No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness. Whereas what I describe above is something that costs
us nearly nothing to provide. The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.
Again Oracle has a mechanism for doing this. Don't know the details,
but our DBA would if anyone cares...
--
Peter Hunsberger
On Tuesday 07 July 2009 17:40:50 Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
I cannot reasonably plan some queries with join_collapse_limit set to 20.
At least not without setting the geqo limit very low and geqo_effort to
a low value.
So I would definitely not agree that removing j_c_l is a good idea.
Can you show some specific examples? All of this discussion seems like
speculation in a vacuum ...
Since similar wishes have come up multiple times now, I started to create a
schema I can present which is sufficiently similar to show the same effects.
I had to cut down the complexity of the schema considerably - both for easier
understanding and easier writing of the demo schema.
I also have a moderately complex demo query similar to the ones really used.
Autogenerated (GUI) queries do not use views like I did in the example one, but
it seemed easier to play around with query size this way.
Also, the real queries often have far more conditions than the one I
present here.
I have not "tuned" the queries here in any way; the join order is not
optimized (as it is in the real application), but I don't think that matters
for the purpose of this discussion.
The queries themselves only sketch what they are intended for and query many
fictional datapoints, but again I don't think this is a problem.
Is it helpful this way?
Some numbers about the query_2.sql are attached. Short overview:
- a low from_collapse_limit is deadly
- a high from_collapse_limit is not costly here
- geqo_effort basically changes nothing
- geqo changes basically nothing
- with a higher join_collapse_limit (12) geqo=on costs quite a bit! (factor
20!). I double checked. At other times I get 'failed to make a valid plan'
The numbers are all 8.5 as of today.
Some explanations about the schema:
- It uses surrogate keys everywhere as the real schema employs some form of
row level, label based access checking (covert channel issues)
- The real schema uses partitions - I don't think they would be interesting
here?
- it's definitely not the most beautiful schema I have seen, but I have to admit
that I cannot think of a much nicer one which serves the different purposes as
well
- somewhat complex queries
- new "information_set"'s and "information"'s are added frequently
- automated and manual data entry has to work with such additions
- GUI query tool needs to work in face of such changes
- I have seen similar schemas multiple times now.
- The original schema employs materialized views in parts (both for execution
and planning speed)
- The queries are crazy, but people definitely create/use them.
Andres
Resending to list due to failure:
451 4.3.5 Server configuration problem
Joshua Tolley <eggyknap@gmail.com> wrote:
Entire stored plans, or predetermined seeds for GEQO's random number
generator all boil down to saying, "I want you to use this plan
henceforth and forever."
A predetermined seed for geqo's random number generator only
guarantees that you get the same plan for the same query as long as
the statistics remain the same. After the next ANALYZE, you may still
get a different plan.
Do we know that GEQO plans are, in reality, less stable than the
usual planner?
Yeah, I had a complaint of a slow query. When I ran EXPLAIN ANALYZE
on it I ran it several times (as I usually do, to establish what
happens with and without caching). I got different plans. So I ran
the query a bunch of times and got a decent plan about 90% of the
time, and an awful one about 10% of the time. It was clearly a geqo
effect, so I didn't post on it. (Why post? The product was clearly
operating as intended.) There was an easy solution; I disabled geqo.
I'm sure it's useful to some; we haven't missed it.
-Kevin
On Thursday 09 July 2009 17:37:41 Kevin Grittner wrote:
Resending to list due to failure:
451 4.3.5 Server configuration problem
Joshua Tolley <eggyknap@gmail.com> wrote:
Entire stored plans, or predetermined seeds for GEQO's random number
generator all boil down to saying, "I want you to use this plan
henceforth and forever."
A predetermined seed for geqo's random number generator only
guarantees that you get the same plan for the same query as long as
the statistics remain the same. After the next ANALYZE, you may still
get a different plan.
Do we know that GEQO plans are, in reality, less stable than the
usual planner?
Yes, definitely. E.g. in the referenced schema (somewhere else in the thread) I
even have queries which fail to plan only sometimes.
Andres
On Wed, Jul 8, 2009 at 4:57 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
Well, the reason I'm not voting for #3 is that it looks like a lot of
work to implement something that would basically be a planner hint,
which I'm generally against; furthermore, it's a hint that there's been
no demand for. (We're not even certain that anyone is using the ability
to *fully* specify the join order, much less wanting some undetermined
compromise between manual and automatic control.) And anyway I didn't
hear anyone volunteering to do it. So the realistic alternatives are
#1, #2, or "do nothing"; and out of those I like #2.
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
Patch attached.
...Robert
Attachments:
collapse_limit.patch (text/x-diff; charset=US-ASCII)
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 2251,2313 **** SELECT * FROM parent WHERE key = 2400;
</listitem>
</varlistentry>
- <varlistentry id="guc-from-collapse-limit" xreflabel="from_collapse_limit">
- <term><varname>from_collapse_limit</varname> (<type>integer</type>)</term>
- <indexterm>
- <primary><varname>from_collapse_limit</> configuration parameter</primary>
- </indexterm>
- <listitem>
- <para>
- The planner will merge sub-queries into upper queries if the
- resulting <literal>FROM</literal> list would have no more than
- this many items. Smaller values reduce planning time but might
- yield inferior query plans. The default is eight.
- For more information see <xref linkend="explicit-joins">.
- </para>
-
- <para>
- Setting this value to <xref linkend="guc-geqo-threshold"> or more
- may trigger use of the GEQO planner, resulting in nondeterministic
- plans. See <xref linkend="runtime-config-query-geqo">.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="guc-join-collapse-limit" xreflabel="join_collapse_limit">
- <term><varname>join_collapse_limit</varname> (<type>integer</type>)</term>
- <indexterm>
- <primary><varname>join_collapse_limit</> configuration parameter</primary>
- </indexterm>
- <listitem>
- <para>
- The planner will rewrite explicit <literal>JOIN</>
- constructs (except <literal>FULL JOIN</>s) into lists of
- <literal>FROM</> items whenever a list of no more than this many items
- would result. Smaller values reduce planning time but might
- yield inferior query plans.
- </para>
-
- <para>
- By default, this variable is set the same as
- <varname>from_collapse_limit</varname>, which is appropriate
- for most uses. Setting it to 1 prevents any reordering of
- explicit <literal>JOIN</>s. Thus, the explicit join order
- specified in the query will be the actual order in which the
- relations are joined. The query planner does not always choose
- the optimal join order; advanced users can elect to
- temporarily set this variable to 1, and then specify the join
- order they desire explicitly.
- For more information see <xref linkend="explicit-joins">.
- </para>
-
- <para>
- Setting this value to <xref linkend="guc-geqo-threshold"> or more
- may trigger use of the GEQO planner, resulting in nondeterministic
- plans. See <xref linkend="runtime-config-query-geqo">.
- </para>
- </listitem>
- </varlistentry>
-
</variablelist>
</sect2>
</sect1>
--- 2251,2256 ----
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***************
*** 599,606 **** WHERE tablename = 'road';
<para>
It is possible
! to control the query planner to some extent by using the explicit <literal>JOIN</>
! syntax. To see why this matters, we first need some background.
</para>
<para>
--- 599,607 ----
<para>
It is possible
! to control the query planner to some extent by using <literal>JOIN</>
! with the <literal>FORCE</> keyword. To see why this matters, we first need
! some background.
</para>
<para>
***************
*** 675,681 **** SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
<para>
Even though most kinds of <literal>JOIN</> don't completely constrain
the join order, it is possible to instruct the
! <productname>PostgreSQL</productname> query planner to treat all
<literal>JOIN</> clauses as constraining the join order anyway.
For example, these three queries are logically equivalent:
<programlisting>
--- 676,682 ----
<para>
Even though most kinds of <literal>JOIN</> don't completely constrain
the join order, it is possible to instruct the
! <productname>PostgreSQL</productname> query planner to treat certain
<literal>JOIN</> clauses as constraining the join order anyway.
For example, these three queries are logically equivalent:
<programlisting>
***************
*** 683,710 **** SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
- But if we tell the planner to honor the <literal>JOIN</> order,
- the second and third take less time to plan than the first. This effect
- is not worth worrying about for only three tables, but it can be a
- lifesaver with many tables.
</para>
<para>
To force the planner to follow the join order laid out by explicit
! <literal>JOIN</>s,
! set the <xref linkend="guc-join-collapse-limit"> run-time parameter to 1.
! (Other possible values are discussed below.)
</para>
<para>
You do not need to constrain the join order completely in order to
! cut search time, because it's OK to use <literal>JOIN</> operators
! within items of a plain <literal>FROM</> list. For example, consider:
<programlisting>
! SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</programlisting>
! With <varname>join_collapse_limit</> = 1, this
! forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
</para>
--- 684,710 ----
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
</para>
<para>
To force the planner to follow the join order laid out by explicit
! <literal>JOIN</>s, use the <literal>FORCE</> keyword, like this:
! <programlisting>
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
! </programlisting>
! Because there is only one possible join order, this query will take less
! time to plan. This effect is not worth worrying about for only three
! tables, but it can be a lifesaver with many tables.
</para>
<para>
You do not need to constrain the join order completely in order to
! cut search time; <literal>FORCE</> can be specified for just some of
! the joins in a given query. For example, consider:
<programlisting>
! SELECT * FROM a INNER FORCE JOIN b ON ..., c, d, e WHERE ...;
</programlisting>
! This forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
</para>
***************
*** 713,719 **** SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
Constraining the planner's search in this way is a useful technique
both for reducing planning time and for directing the planner to a
good query plan. If the planner chooses a bad join order by default,
! you can force it to choose a better order via <literal>JOIN</> syntax
— assuming that you know of a better order, that is. Experimentation
is recommended.
</para>
--- 713,719 ----
Constraining the planner's search in this way is a useful technique
both for reducing planning time and for directing the planner to a
good query plan. If the planner chooses a bad join order by default,
! you can force it to choose a better order using <literal>FORCE</>
— assuming that you know of a better order, that is. Experimentation
is recommended.
</para>
***************
*** 729,736 **** WHERE somethingelse;
</programlisting>
This situation might arise from use of a view that contains a join;
the view's <literal>SELECT</> rule will be inserted in place of the view
! reference, yielding a query much like the above. Normally, the planner
! will try to collapse the subquery into the parent, yielding:
<programlisting>
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
</programlisting>
--- 729,736 ----
</programlisting>
This situation might arise from use of a view that contains a join;
the view's <literal>SELECT</> rule will be inserted in place of the view
! reference, yielding a query much like the above. The planner
! will always try to collapse the subquery into the parent, yielding:
<programlisting>
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
</programlisting>
***************
*** 741,765 **** SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
we have increased the planning time; here, we have a five-way join
problem replacing two separate three-way join problems. Because of the
exponential growth of the number of possibilities, this makes a big
! difference. The planner tries to avoid getting stuck in huge join search
! problems by not collapsing a subquery if more than <varname>from_collapse_limit</>
! <literal>FROM</> items would result in the parent
! query. You can trade off planning time against quality of plan by
! adjusting this run-time parameter up or down.
! </para>
!
! <para>
! <xref linkend="guc-from-collapse-limit"> and <xref
! linkend="guc-join-collapse-limit">
! are similarly named because they do almost the same thing: one controls
! when the planner will <quote>flatten out</> subqueries, and the
! other controls when it will flatten out explicit joins. Typically
! you would either set <varname>join_collapse_limit</> equal to
! <varname>from_collapse_limit</> (so that explicit joins and subqueries
! act similarly) or set <varname>join_collapse_limit</> to 1 (if you want
! to control join order with explicit joins). But you might set them
! differently if you are trying to fine-tune the trade-off between planning
! time and run time.
</para>
</sect1>
--- 741,748 ----
we have increased the planning time; here, we have a five-way join
problem replacing two separate three-way join problems. Because of the
exponential growth of the number of possibilities, this makes a big
! difference. For large join problems, you may need to constrain the join
! order or use <xref linkend="guc-geqo-threshold">.
</para>
</sect1>
*** a/doc/src/sgml/ref/select.sgml
--- b/doc/src/sgml/ref/select.sgml
***************
*** 357,372 **** TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
One of
<itemizedlist>
<listitem>
! <para><literal>[ INNER ] JOIN</literal></para>
</listitem>
<listitem>
! <para><literal>LEFT [ OUTER ] JOIN</literal></para>
</listitem>
<listitem>
! <para><literal>RIGHT [ OUTER ] JOIN</literal></para>
</listitem>
<listitem>
! <para><literal>FULL [ OUTER ] JOIN</literal></para>
</listitem>
<listitem>
<para><literal>CROSS JOIN</literal></para>
--- 357,372 ----
One of
<itemizedlist>
<listitem>
! <para><literal>[ INNER [ FORCE ] ] JOIN</literal></para>
</listitem>
<listitem>
! <para><literal>LEFT [ OUTER ] [ FORCE ] JOIN</literal></para>
</listitem>
<listitem>
! <para><literal>RIGHT [ OUTER ] [ FORCE ] JOIN</literal></para>
</listitem>
<listitem>
! <para><literal>FULL [ OUTER ] [ FORCE ] JOIN</literal></para>
</listitem>
<listitem>
<para><literal>CROSS JOIN</literal></para>
***************
*** 389,395 **** TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
determine the order of nesting. In the absence of parentheses,
<literal>JOIN</literal>s nest left-to-right. In any case
<literal>JOIN</literal> binds more tightly than the commas
! separating <literal>FROM</> items.
</para>
<para>
--- 389,397 ----
determine the order of nesting. In the absence of parentheses,
<literal>JOIN</literal>s nest left-to-right. In any case
<literal>JOIN</literal> binds more tightly than the commas
! separating <literal>FROM</> items. The keyword
! <literal>FORCE</literal> constrains the join order
! (see <xref linkend="explicit-joins">).
</para>
<para>
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 1542,1547 **** _copyJoinExpr(JoinExpr *from)
--- 1542,1548 ----
COPY_SCALAR_FIELD(jointype);
COPY_SCALAR_FIELD(isNatural);
+ COPY_SCALAR_FIELD(isForce);
COPY_NODE_FIELD(larg);
COPY_NODE_FIELD(rarg);
COPY_NODE_FIELD(using);
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 702,707 **** _equalJoinExpr(JoinExpr *a, JoinExpr *b)
--- 702,708 ----
{
COMPARE_SCALAR_FIELD(jointype);
COMPARE_SCALAR_FIELD(isNatural);
+ COMPARE_SCALAR_FIELD(isForce);
COMPARE_NODE_FIELD(larg);
COMPARE_NODE_FIELD(rarg);
COMPARE_NODE_FIELD(using);
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 1257,1262 **** _outJoinExpr(StringInfo str, JoinExpr *node)
--- 1257,1263 ----
WRITE_ENUM_FIELD(jointype, JoinType);
WRITE_BOOL_FIELD(isNatural);
+ WRITE_BOOL_FIELD(isForce);
WRITE_NODE_FIELD(larg);
WRITE_NODE_FIELD(rarg);
WRITE_NODE_FIELD(using);
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
***************
*** 1068,1073 **** _readJoinExpr(void)
--- 1068,1074 ----
READ_ENUM_FIELD(jointype, JoinType);
READ_BOOL_FIELD(isNatural);
+ READ_BOOL_FIELD(isForce);
READ_NODE_FIELD(larg);
READ_NODE_FIELD(rarg);
READ_NODE_FIELD(using);
*** a/src/backend/optimizer/README
--- b/src/backend/optimizer/README
***************
*** 90,99 **** single join relation.
2) Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join. However, FULL OUTER JOIN clauses are
never flattened, and other kinds of JOIN might not be either, if the
! flattening process is stopped by join_collapse_limit or from_collapse_limit
! restrictions. Therefore, we end up with a planning problem that contains
! lists of relations to be joined in any order, where any individual item
! might be a sub-list that has to be joined together before we can consider
joining it to its siblings. We process these sub-problems recursively,
bottom up. Note that the join list structure constrains the possible join
orders, but it doesn't constrain the join implementation method at each
--- 90,98 ----
2) Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join. However, FULL OUTER JOIN clauses are
never flattened, and other kinds of JOIN might not be either, if the
! FORCE keyword is used. Therefore, we end up with a planning problem that
! contains lists of relations to be joined in any order, where any individual
! item might be a sub-list that has to be joined together before we can consider
joining it to its siblings. We process these sub-problems recursively,
bottom up. Note that the join list structure constrains the possible join
orders, but it doesn't constrain the join implementation method at each
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
***************
*** 32,43 ****
#include "utils/lsyscache.h"
#include "utils/syscache.h"
-
- /* These parameters are set by GUC */
- int from_collapse_limit;
- int join_collapse_limit;
-
-
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
bool below_outer_join,
Relids *qualscope, Relids *inner_join_rels);
--- 32,37 ----
***************
*** 215,221 **** add_vars_to_targetlist(PlannerInfo *root, List *vars, Relids where_needed)
* (note that legal orders may be constrained by SpecialJoinInfo nodes).
* A sub-joinlist represents a subproblem to be planned separately. Currently
* sub-joinlists arise only from FULL OUTER JOIN or when collapsing of
! * subproblems is stopped by join_collapse_limit or from_collapse_limit.
*
* NOTE: when dealing with inner joins, it is appropriate to let a qual clause
* be evaluated at the lowest level where all the variables it mentions are
--- 209,215 ----
* (note that legal orders may be constrained by SpecialJoinInfo nodes).
* A sub-joinlist represents a subproblem to be planned separately. Currently
* sub-joinlists arise only from FULL OUTER JOIN or when collapsing of
! * subproblems is stopped by the FORCE keyword.
*
* NOTE: when dealing with inner joins, it is appropriate to let a qual clause
* be evaluated at the lowest level where all the variables it mentions are
***************
*** 284,320 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
else if (IsA(jtnode, FromExpr))
{
FromExpr *f = (FromExpr *) jtnode;
- int remaining;
ListCell *l;
/*
* First, recurse to handle child joins. We collapse subproblems into
! * a single joinlist whenever the resulting joinlist wouldn't exceed
! * from_collapse_limit members. Also, always collapse one-element
! * subproblems, since that won't lengthen the joinlist anyway.
*/
*qualscope = NULL;
*inner_join_rels = NULL;
joinlist = NIL;
- remaining = list_length(f->fromlist);
foreach(l, f->fromlist)
{
Relids sub_qualscope;
List *sub_joinlist;
- int sub_members;
sub_joinlist = deconstruct_recurse(root, lfirst(l),
below_outer_join,
&sub_qualscope,
inner_join_rels);
*qualscope = bms_add_members(*qualscope, sub_qualscope);
! sub_members = list_length(sub_joinlist);
! remaining--;
! if (sub_members <= 1 ||
! list_length(joinlist) + sub_members + remaining <= from_collapse_limit)
! joinlist = list_concat(joinlist, sub_joinlist);
! else
! joinlist = lappend(joinlist, sub_joinlist);
}
/*
--- 278,303 ----
else if (IsA(jtnode, FromExpr))
{
FromExpr *f = (FromExpr *) jtnode;
ListCell *l;
/*
* First, recurse to handle child joins. We collapse subproblems into
! * a single joinlist.
*/
*qualscope = NULL;
*inner_join_rels = NULL;
joinlist = NIL;
foreach(l, f->fromlist)
{
Relids sub_qualscope;
List *sub_joinlist;
sub_joinlist = deconstruct_recurse(root, lfirst(l),
below_outer_join,
&sub_qualscope,
inner_join_rels);
*qualscope = bms_add_members(*qualscope, sub_qualscope);
! joinlist = list_concat(joinlist, sub_joinlist);
}
/*
***************
*** 469,505 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
/*
* Finally, compute the output joinlist. We fold subproblems together
! * except at a FULL JOIN or where join_collapse_limit would be
! * exceeded.
*/
! if (j->jointype == JOIN_FULL)
{
/* force the join order exactly at this node */
joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
}
! else if (list_length(leftjoinlist) + list_length(rightjoinlist) <=
! join_collapse_limit)
{
/* OK to combine subproblems */
joinlist = list_concat(leftjoinlist, rightjoinlist);
}
- else
- {
- /* can't combine, but needn't force join order above here */
- Node *leftpart,
- *rightpart;
-
- /* avoid creating useless 1-element sublists */
- if (list_length(leftjoinlist) == 1)
- leftpart = (Node *) linitial(leftjoinlist);
- else
- leftpart = (Node *) leftjoinlist;
- if (list_length(rightjoinlist) == 1)
- rightpart = (Node *) linitial(rightjoinlist);
- else
- rightpart = (Node *) rightjoinlist;
- joinlist = list_make2(leftpart, rightpart);
- }
}
else
{
--- 452,469 ----
/*
* Finally, compute the output joinlist. We fold subproblems together
! * except at a FULL JOIN or where FORCE has been specified.
*/
! if (j->jointype == JOIN_FULL || j->isForce)
{
/* force the join order exactly at this node */
joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
}
! else
{
/* OK to combine subproblems */
joinlist = list_concat(leftjoinlist, rightjoinlist);
}
}
else
{
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
***************
*** 1087,1092 **** convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
--- 1087,1093 ----
result = makeNode(JoinExpr);
result->jointype = JOIN_SEMI;
result->isNatural = false;
+ result->isForce = false;
result->larg = NULL; /* caller must fill this in */
result->rarg = (Node *) rtr;
result->using = NIL;
***************
*** 1227,1232 **** convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
--- 1228,1234 ----
result = makeNode(JoinExpr);
result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;
result->isNatural = false;
+ result->isForce = false;
result->larg = NULL; /* caller must fill this in */
/* flatten out the FromExpr node if it's useless */
if (list_length(subselect->jointree->fromlist) == 1)
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 7437,7459 **** joined_table:
JoinExpr *n = makeNode(JoinExpr);
n->jointype = JOIN_INNER;
n->isNatural = FALSE;
n->larg = $1;
n->rarg = $4;
n->using = NIL;
n->quals = NULL;
$$ = n;
}
! | table_ref join_type JOIN table_ref join_qual
{
JoinExpr *n = makeNode(JoinExpr);
n->jointype = $2;
n->isNatural = FALSE;
n->larg = $1;
! n->rarg = $4;
! if ($5 != NULL && IsA($5, List))
! n->using = (List *) $5; /* USING clause */
else
! n->quals = $5; /* ON clause */
$$ = n;
}
| table_ref JOIN table_ref join_qual
--- 7437,7461 ----
JoinExpr *n = makeNode(JoinExpr);
n->jointype = JOIN_INNER;
n->isNatural = FALSE;
+ n->isForce = FALSE;
n->larg = $1;
n->rarg = $4;
n->using = NIL;
n->quals = NULL;
$$ = n;
}
! | table_ref join_type opt_force JOIN table_ref join_qual
{
JoinExpr *n = makeNode(JoinExpr);
n->jointype = $2;
n->isNatural = FALSE;
+ n->isForce = $3;
n->larg = $1;
! n->rarg = $5;
! if ($6 != NULL && IsA($6, List))
! n->using = (List *) $6; /* USING clause */
else
! n->quals = $6; /* ON clause */
$$ = n;
}
| table_ref JOIN table_ref join_qual
***************
*** 7462,7467 **** joined_table:
--- 7464,7470 ----
JoinExpr *n = makeNode(JoinExpr);
n->jointype = JOIN_INNER;
n->isNatural = FALSE;
+ n->isForce = FALSE;
n->larg = $1;
n->rarg = $3;
if ($4 != NULL && IsA($4, List))
***************
*** 7475,7480 **** joined_table:
--- 7478,7484 ----
JoinExpr *n = makeNode(JoinExpr);
n->jointype = $3;
n->isNatural = TRUE;
+ n->isForce = FALSE;
n->larg = $1;
n->rarg = $5;
n->using = NIL; /* figure out which columns later... */
***************
*** 7487,7492 **** joined_table:
--- 7491,7497 ----
JoinExpr *n = makeNode(JoinExpr);
n->jointype = JOIN_INNER;
n->isNatural = TRUE;
+ n->isForce = FALSE;
n->larg = $1;
n->rarg = $4;
n->using = NIL; /* figure out which columns later... */
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***************
*** 5824,5829 **** get_from_clause_item(Node *jtnode, Query *query, deparse_context *context)
--- 5824,5831 ----
{
JoinExpr *j = (JoinExpr *) jtnode;
bool need_paren_on_right;
+ char buffer[64];
+ int indentPlus;
need_paren_on_right = PRETTY_PAREN(context) &&
!IsA(j->rarg, RangeTblRef) &&
***************
*** 5834,5905 **** get_from_clause_item(Node *jtnode, Query *query, deparse_context *context)
get_from_clause_item(j->larg, query, context);
if (j->isNatural)
{
if (!PRETTY_INDENT(context))
appendStringInfoChar(buf, ' ');
! switch (j->jointype)
! {
! case JOIN_INNER:
! appendContextKeyword(context, "NATURAL JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 0);
! break;
! case JOIN_LEFT:
! appendContextKeyword(context, "NATURAL LEFT JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 0);
! break;
! case JOIN_FULL:
! appendContextKeyword(context, "NATURAL FULL JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 0);
! break;
! case JOIN_RIGHT:
! appendContextKeyword(context, "NATURAL RIGHT JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 0);
! break;
! default:
! elog(ERROR, "unrecognized join type: %d",
! (int) j->jointype);
! }
}
else
{
! switch (j->jointype)
! {
! case JOIN_INNER:
! if (j->quals)
! appendContextKeyword(context, " JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 2);
! else
! appendContextKeyword(context, " CROSS JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 1);
! break;
! case JOIN_LEFT:
! appendContextKeyword(context, " LEFT JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 2);
! break;
! case JOIN_FULL:
! appendContextKeyword(context, " FULL JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 2);
! break;
! case JOIN_RIGHT:
! appendContextKeyword(context, " RIGHT JOIN ",
! -PRETTYINDENT_JOIN,
! PRETTYINDENT_JOIN, 2);
! break;
! default:
! elog(ERROR, "unrecognized join type: %d",
! (int) j->jointype);
! }
}
if (need_paren_on_right)
appendStringInfoChar(buf, '(');
get_from_clause_item(j->rarg, query, context);
--- 5836,5891 ----
get_from_clause_item(j->larg, query, context);
+ buffer[0] = '\0';
if (j->isNatural)
{
if (!PRETTY_INDENT(context))
appendStringInfoChar(buf, ' ');
! strcat(buffer, "NATURAL ");
! indentPlus = 0;
}
else
{
! strcat(buffer, " ");
! indentPlus = 2;
! }
!
! switch (j->jointype)
! {
! case JOIN_INNER:
! if (!j->quals && !j->isNatural)
! {
! strcat(buffer, "CROSS ");
! indentPlus = 1;
! }
! /*
! * INNER is normally just decoration, but it's required when
! * we also need to emit FORCE.
! */
! if (j->isForce)
! strcat(buffer, "INNER ");
! break;
! case JOIN_LEFT:
! strcat(buffer, "LEFT ");
! break;
! case JOIN_FULL:
! strcat(buffer, "FULL ");
! break;
! case JOIN_RIGHT:
! strcat(buffer, "RIGHT ");
! break;
! default:
! elog(ERROR, "unrecognized join type: %d", (int) j->jointype);
}
+ if (j->isForce)
+ strcat(buffer, "FORCE JOIN ");
+ else
+ strcat(buffer, "JOIN ");
+
+ appendContextKeyword(context, buffer, -PRETTYINDENT_JOIN,
+ PRETTYINDENT_JOIN, indentPlus);
+
if (need_paren_on_right)
appendStringInfoChar(buf, '(');
get_from_clause_item(j->rarg, query, context);
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 1259,1286 **** static struct config_int ConfigureNamesInt[] =
100, 1, 10000, NULL, NULL
},
{
- {"from_collapse_limit", PGC_USERSET, QUERY_TUNING_OTHER,
- gettext_noop("Sets the FROM-list size beyond which subqueries "
- "are not collapsed."),
- gettext_noop("The planner will merge subqueries into upper "
- "queries if the resulting FROM list would have no more than "
- "this many items.")
- },
- &from_collapse_limit,
- 8, 1, INT_MAX, NULL, NULL
- },
- {
- {"join_collapse_limit", PGC_USERSET, QUERY_TUNING_OTHER,
- gettext_noop("Sets the FROM-list size beyond which JOIN "
- "constructs are not flattened."),
- gettext_noop("The planner will flatten explicit JOIN "
- "constructs into lists of FROM items whenever a "
- "list of no more than this many items would result.")
- },
- &join_collapse_limit,
- 8, 1, INT_MAX, NULL, NULL
- },
- {
{"geqo_threshold", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Sets the threshold of FROM items beyond which GEQO is used."),
NULL
--- 1259,1264 ----
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 219,227 ****
#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
- #from_collapse_limit = 8
- #join_collapse_limit = 8 # 1 disables collapsing of explicit
- # JOIN clauses
#------------------------------------------------------------------------------
--- 219,224 ----
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
***************
*** 1150,1155 **** typedef struct JoinExpr
--- 1150,1156 ----
NodeTag type;
JoinType jointype; /* type of join */
bool isNatural; /* Natural join? Will need to shape table */
+ bool isForce; /* Join order forced? */
Node *larg; /* left subtree */
Node *rarg; /* right subtree */
List *using; /* USING clause, if any (list of String) */
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 79,87 **** extern bool is_projection_capable_plan(Plan *plan);
/*
* prototypes for plan/initsplan.c
*/
- extern int from_collapse_limit;
- extern int join_collapse_limit;
-
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
--- 79,84 ----
Hi,
Le 10 juil. 09 à 17:22, Robert Haas a écrit :
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
I see you're using the following syntax:
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref =
c.id)) ON (a.id = b.id);
The only place I've seen that before is MySQL straight_join feature:
http://dev.mysql.com/doc/refman/5.0/en/join.html
My first thought at the time was "what a lame optimiser if I'm to tell
it where not to reorder joins", but perhaps this was because the
option there, IIRC, could impact the results...
I guess I'm not going to like it, but nonetheless, if we're going to
support the feature, what about calling it the same as MySQL, unless
the standard has something to say?
--
dim
Robert Haas <robertmhaas@gmail.com> writes:
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
You're right, I don't. Even if I thought this were a good idea, which
I most definitely do not, the need to add a nonstandard fully-reserved
word is sufficient reason to reject it. (The patch tries to pretend
it's not going to reserve the word, but that only works because you have
carefully changed only one of the five JOIN productions, leading to
bizarrely non-orthogonal syntax.)
regards, tom lane
On Jul 10, 2009, at 11:48 AM, Dimitri Fontaine <dfontaine@hi-
media.com> wrote:
Hi,
Le 10 juil. 09 à 17:22, Robert Haas a écrit :
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
I see you're using the following syntax:
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref =
c.id)) ON (a.id = b.id);
The only place I've seen that before is MySQL straight_join feature:
http://dev.mysql.com/doc/refman/5.0/en/join.html
My first thought at the time was "what a lame optimiser if I'm to
tell it where not to reorder joins", but perhaps this was because
the option there, IIRC, could impact the results...
I guess I'm not going to like it, but nonetheless, if we're going to
support the feature, what about calling it the same as MySQL, unless
the standard has something to say?
Well, I think we would be well-advised to get past the threshold issue
of whether or not the overall design sucks before quibbling over
details like my particular choice of keyword. That having been said,
if the MySQL feature to which you linked actually does the same thing
as what I implemented here, then it's amazingly poorly documented. We
certainly don't guarantee anything about the order in which the input
tables are read; I'd ask what that even means except I don't care.
We're only making a promise about where that join will be implemented
in the plan tree as compared with *other joins*.
It was reported upthread that Oracle uses ORDERED for this but I don't
know whether either the syntax or the semantics match what I did
here. At any rate the particular choice of keyword seems rather
insignificant; I picked it because it was already a keyword and seemed
vaguely appropriate, but it could easily be changed.
...Robert
Robert Haas <robertmhaas@gmail.com> wrote:
At any rate the particular choice of keyword seems rather
insignificant; I picked it because it was already a keyword and
seemed vaguely appropriate, but it could easily be changed.
Actually, if we were going to add fine-grained optimizer hints for
this (which I'm not at all convinced is a good idea), I'd be tempted
to go with what I saw a few years ago in SAP-DB (later rebranded as
MySQL Max-DB): treat parentheses around JOIN operations as optimizer
hints. I would only consider this as remotely sane if there was an
enable_* GUC to disable the normal reordering of joins. It introduces
no new keywords. The queries are still standard compliant. We would
just have a configuration option which treats an optional part of the
standard syntax as a hint.
In other words:
select * from (t1 join t2 on <whatever>) join t3 on <whatever>;
would limit optimizer choice from those available with:
select * from t1 join t2 on <whatever> join t3 on <whatever>;
-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Actually, if we were going to add fine-grained optimizer hints for
this (which I'm not at all convinced is a good idea), I'd be tempted
to go with what I saw a few years ago in SAP-DB (later rebranded as
MySQL Max-DB): treat parentheses around JOIN operations as optimizer
hints.
That's a *truly* horrid idea, as sometimes you need them simply to
get the precedence correct. Such awful design from SAP doesn't surprise
me, and MySQL copying a bad idea surprises me even less, but let's not
go there.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
treat parentheses around JOIN operations as optimizer hints.
That's a *truly* horrid idea, as sometimes you need them simply to
get the precedence correct.
You do, but it's been pretty rare in my experience, and we're
considering alternatives which give a lot less flexibility than this.
The *truly* awful thing about the SAP-DB implementation is that it
wasn't optional -- parentheses in this part of a query always limited
optimizer options; I sure wouldn't want to go there again. I thought
we were talking about options for what to do when an enable_* setting
was off for diagnostic purposes....
-Kevin
On Jul 10, 2009, at 12:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
You're right, I don't. Even if I thought this were a good idea, which
I most definitely do not, the need to add a nonstandard fully-reserved
word is sufficient reason to reject it. (The patch tries to pretend
it's not going to reserve the word, but that only works because you
have
carefully changed only one of the five JOIN productions, leading to
bizarrely non-orthogonal syntax.)
Well, it's pretty obvious that only one of those productions is
actually a problem, and that is the one which produces an undecorated
JOIN. The others could all be changed easily enough, but they add no
expressive power, so I didn't think it very worthwhile to add MORE non-
standard syntax. In any event, the current non-orthogonality is
exponentially more bizarre: you can constrain the join order by
setting join_collapse_limit to 1, but then you must write the joins
you want constrained using JOIN and the others as FROM items, which of
course doesn't work at all for LEFT or RIGHT joins and will have
random and unpredictable effects on subqueries pulled in via VIEWs.
Needing to write INNER to be able to use FORCE pales by comparison.
That having been said, I'm not excited about pushing water up a hill.
The important thing here is to remove the collapse limits; providing a
tool to control the join order that won't make you want to beat your
head in a brick is just something that can be trivially done with no
extra code, not my primary goal.
...Robert
On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas<robertmhaas@gmail.com> wrote:
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
Patch attached.
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref =
c.id)) ON (a.id = b.id);
what's next? FORCE INDEX?
once we implement one hint like this the complaints about the others
will lose force
--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
You do, but it's been pretty rare in my experience, and we're
considering alternatives which give a lot less flexibility than this.
I dunno about "considering". We've already wasted vastly more time on
this than it's worth. AFAIR there has never been one single user
request for the ability to partially constrain join order. I think we
should do an enable_join_ordering boolean and quit wasting brainpower on
the issue.
regards, tom lane
Tom Lane wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
You do, but it's been pretty rare in my experience, and we're
considering alternatives which give a lot less flexibility than this.
I dunno about "considering". We've already wasted vastly more time on
this than it's worth. AFAIR there has never been one single user
request for the ability to partially constrain join order. I think we
should do an enable_join_ordering boolean and quit wasting brainpower on
the issue.
I think I've found it useful in the past[1], but I also think we
already have a way to give postgres such hints using subselects
and "offset 0".
Instead of SAP-DB's
select * from (t1 join t2 on <whatever>) join t3 on <whatever>;
ISTM we can already do
select * from (select * from t1 join t2 on <whatever> offset 0) as a join t3 on <whatever>;
which seems like a reasonably way of hinting which parenthesis
can be reordered and which can't.
Would these new proposals give (guc's or syntax hacks) anything that
I can't already do?
[1]: http://archives.postgresql.org/pgsql-performance/2007-12/msg00088.php
On Fri, Jul 10, 2009 at 2:44 PM, Jaime
Casanova<jcasanov@systemguards.com.ec> wrote:
On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas<robertmhaas@gmail.com> wrote:
I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c). Of
course you still don't have to like it. :-)
Patch attached.
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref =
c.id)) ON (a.id = b.id);
what's next? FORCE INDEX?
once we implement one hint like this the complaints about the others
will lose force
That would be pretty useless, because while it might work for trivial
cases, in a complex plan, it's almost certainly not going to do what
you want unless you specify every detail of the entire plan along with
it, which kind of defeats the purpose of using a database with a
sophisticated planner. Actually, it seems to me that if we were to
add hints, the place to start would be with selectivity, since in my
experience bad selectivity estimates (often by 3, 4, 5, or 6 orders of
magnitude) are often the reason for a bad plan, and if you can fix
that, the rest takes care of itself - but fixing it is often very
difficult. Of course, improving the selectivity estimator would be
even better, because time spent applying hints to queries is time that
could be better spent doing something else, and in fact one of the
reasons why I started using PostgreSQL is because complex queries Just
Work.
Constraining the join order does seem a lot less useful, because (for
example) it won't compel the planner to use a hash join instead of a
nest join, or to make a sensible decision about which relations should
be on the inner side of the join. But I still think that it's worth
considering, because:
1. We basically already support the functionality - it's just broken.
Today, you can control the join order by setting join_collapse_limit =
1; then, the joins you want to order you specify using JOIN syntax;
the ones you don't want to order, you write as FROM items. But all of
your outer joins will necessarily have the order forced, because
there's no way to write those as FROM items. And if you want to use a
view that was written without this convention in mind, you're hosed.
So I think we either ought to remove it or fix it.
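To make the inconsistency concrete, here is a minimal sketch (the table
names are invented for illustration) of how join_collapse_limit = 1
behaves today:

```sql
SET join_collapse_limit = 1;

-- These inner joins now keep exactly the order written...
SELECT * FROM a JOIN b ON a.id = b.id JOIN c ON b.ref = c.id;

-- ...while the equivalent comma-separated FROM items remain freely
-- reorderable by the planner:
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;

-- But an outer join has no comma-list spelling, so its order is
-- forced at this setting whether or not that was intended:
SELECT * FROM a LEFT JOIN b ON a.id = b.id, c WHERE b.ref = c.id;
```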
2. It's actually LESS code to support it completely than it is to
leave it the way it is now while removing from_collapse_limit and
replacing join_collapse_limit with enable_join_ordering. Compare the
patch I sent earlier with the one that I'm about to send.
3. The point of this particular type of hint, at least as I see it, is
not so much to force the planner into the correct query plan as it is
to constrain the planning time. There are very few tools available
for this today: join_collapse_limit and from_collapse_limit do it by
unexpectedly producing terrible plans, and the only other option is
GEQO. In a case like the one posted upthread where you have ten or so
joins that can be reordered almost arbitrarily, you could shove a
single FORCE right into the middle and split it into two problems that
can be planned consecutively. This isn't that different from what
join_collapse_limit does today, but it doesn't force a uniform
threshold on you across the board; you can apply it where, when, and to
the extent necessary to control planning time for a particular query,
and you have to explicitly ask for it so you can't complain you were
ambushed if it doesn't work out.
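Under the syntax from the patch upthread, that might look like the
following sketch (tables t1 through t8 and their join conditions are
made up for illustration):

```sql
-- A single FORCE in the middle pins just this one join, splitting
-- one large join-order search problem into two smaller subproblems
-- that the planner solves independently.
SELECT *
FROM (t1 JOIN t2 ON t1.id = t2.t1_id
         JOIN t3 ON t2.id = t3.t2_id
         JOIN t4 ON t3.id = t4.t3_id)
     INNER FORCE JOIN
     (t5 JOIN t6 ON t5.id = t6.t5_id
         JOIN t7 ON t6.id = t7.t6_id
         JOIN t8 ON t7.id = t8.t7_id)
     ON t4.id = t5.t4_id;
```

Within each parenthesized group the planner is still free to reorder
the joins and pick join methods; only the boundary between the two
halves is fixed.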
But I can see I'm outvoted, so I give up!
...Robert
On Fri, Jul 10, 2009 at 2:48 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
You do, but it's been pretty rare in my experience, and we're
considering alternatives which give a lot less flexibility than this.
I dunno about "considering". We've already wasted vastly more time on
this than it's worth. AFAIR there has never been one single user
request for the ability to partially constrain join order. I think we
should do an enable_join_ordering boolean and quit wasting brainpower on
the issue.
Patch attached.
...Robert
Attachments:
enable_join_ordering.patch (text/x-diff; charset=US-ASCII)
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 1798,1803 **** archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
--- 1798,1823 ----
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-join-ordering" xreflabel="enable_join_ordering">
+ <term><varname>enable_join_ordering</varname> (<type>boolean</type>)</term>
+ <indexterm>
+ <primary><varname>enable_join_ordering</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Allows the planner to reorder joins, which is generally appropriate
+ for most uses. Setting it to false prevents any reordering of
+ explicit <literal>JOIN</>s. Thus, the explicit join order
+ specified in the query will be the actual order in which the
+ relations are joined. The query planner does not always choose
+ the optimal join order; advanced users can elect to
+ temporarily set this variable to false, and then specify the join
+ order they desire explicitly.
+ For more information see <xref linkend="explicit-joins">.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-mergejoin" xreflabel="enable_mergejoin">
<term><varname>enable_mergejoin</varname> (<type>boolean</type>)</term>
<indexterm>
***************
*** 2251,2313 **** SELECT * FROM parent WHERE key = 2400;
</listitem>
</varlistentry>
- <varlistentry id="guc-from-collapse-limit" xreflabel="from_collapse_limit">
- <term><varname>from_collapse_limit</varname> (<type>integer</type>)</term>
- <indexterm>
- <primary><varname>from_collapse_limit</> configuration parameter</primary>
- </indexterm>
- <listitem>
- <para>
- The planner will merge sub-queries into upper queries if the
- resulting <literal>FROM</literal> list would have no more than
- this many items. Smaller values reduce planning time but might
- yield inferior query plans. The default is eight.
- For more information see <xref linkend="explicit-joins">.
- </para>
-
- <para>
- Setting this value to <xref linkend="guc-geqo-threshold"> or more
- may trigger use of the GEQO planner, resulting in nondeterministic
- plans. See <xref linkend="runtime-config-query-geqo">.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="guc-join-collapse-limit" xreflabel="join_collapse_limit">
- <term><varname>join_collapse_limit</varname> (<type>integer</type>)</term>
- <indexterm>
- <primary><varname>join_collapse_limit</> configuration parameter</primary>
- </indexterm>
- <listitem>
- <para>
- The planner will rewrite explicit <literal>JOIN</>
- constructs (except <literal>FULL JOIN</>s) into lists of
- <literal>FROM</> items whenever a list of no more than this many items
- would result. Smaller values reduce planning time but might
- yield inferior query plans.
- </para>
-
- <para>
- By default, this variable is set the same as
- <varname>from_collapse_limit</varname>, which is appropriate
- for most uses. Setting it to 1 prevents any reordering of
- explicit <literal>JOIN</>s. Thus, the explicit join order
- specified in the query will be the actual order in which the
- relations are joined. The query planner does not always choose
- the optimal join order; advanced users can elect to
- temporarily set this variable to 1, and then specify the join
- order they desire explicitly.
- For more information see <xref linkend="explicit-joins">.
- </para>
-
- <para>
- Setting this value to <xref linkend="guc-geqo-threshold"> or more
- may trigger use of the GEQO planner, resulting in nondeterministic
- plans. See <xref linkend="runtime-config-query-geqo">.
- </para>
- </listitem>
- </varlistentry>
-
</variablelist>
</sect2>
</sect1>
--- 2271,2276 ----
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***************
*** 692,699 **** SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
<para>
To force the planner to follow the join order laid out by explicit
<literal>JOIN</>s,
! set the <xref linkend="guc-join-collapse-limit"> run-time parameter to 1.
! (Other possible values are discussed below.)
</para>
<para>
--- 692,699 ----
<para>
To force the planner to follow the join order laid out by explicit
<literal>JOIN</>s,
! set the <xref linkend="guc-enable-join-ordering"> run-time parameter to
! false.
</para>
<para>
***************
*** 703,710 **** SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
<programlisting>
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</programlisting>
! With <varname>join_collapse_limit</> = 1, this
! forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
</para>
--- 703,709 ----
<programlisting>
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</programlisting>
! This forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
</para>
***************
*** 741,765 **** SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
we have increased the planning time; here, we have a five-way join
problem replacing two separate three-way join problems. Because of the
exponential growth of the number of possibilities, this makes a big
! difference. The planner tries to avoid getting stuck in huge join search
! problems by not collapsing a subquery if more than <varname>from_collapse_limit</>
! <literal>FROM</> items would result in the parent
! query. You can trade off planning time against quality of plan by
! adjusting this run-time parameter up or down.
! </para>
!
! <para>
! <xref linkend="guc-from-collapse-limit"> and <xref
! linkend="guc-join-collapse-limit">
! are similarly named because they do almost the same thing: one controls
! when the planner will <quote>flatten out</> subqueries, and the
! other controls when it will flatten out explicit joins. Typically
! you would either set <varname>join_collapse_limit</> equal to
! <varname>from_collapse_limit</> (so that explicit joins and subqueries
! act similarly) or set <varname>join_collapse_limit</> to 1 (if you want
! to control join order with explicit joins). But you might set them
! differently if you are trying to fine-tune the trade-off between planning
! time and run time.
</para>
</sect1>
--- 740,746 ----
we have increased the planning time; here, we have a five-way join
problem replacing two separate three-way join problems. Because of the
exponential growth of the number of possibilities, this makes a big
! difference.
</para>
</sect1>
*** a/src/backend/optimizer/README
--- b/src/backend/optimizer/README
***************
*** 89,106 **** single join relation.
2) Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join. However, FULL OUTER JOIN clauses are
! never flattened, and other kinds of JOIN might not be either, if the
! flattening process is stopped by join_collapse_limit or from_collapse_limit
! restrictions. Therefore, we end up with a planning problem that contains
! lists of relations to be joined in any order, where any individual item
! might be a sub-list that has to be joined together before we can consider
! joining it to its siblings. We process these sub-problems recursively,
! bottom up. Note that the join list structure constrains the possible join
! orders, but it doesn't constrain the join implementation method at each
! join (nestloop, merge, hash), nor does it say which rel is considered outer
! or inner at each join. We consider all these possibilities in building
! Paths. We generate a Path for each feasible join method, and select the
! cheapest Path.
For each planning problem, therefore, we will have a list of relations
that are either base rels or joinrels constructed per sub-join-lists.
--- 89,105 ----
2) Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join. However, FULL OUTER JOIN clauses are
! never flattened and other kinds of JOIN might not be either, if
! enable_join_ordering is set to false. Therefore, we end up with a planning
! problem that contains lists of relations to be joined in any order, where
! any individual item might be a sub-list that has to be joined together
! before we can consider joining it to its siblings. We process these
! sub-problems recursively, bottom up. Note that the join list structure
! constrains the possible join orders, but it doesn't constrain the join
! implementation method at each join (nestloop, merge, hash), nor does it
! say which rel is considered outer or inner at each join. We consider
! all these possibilities in building Paths. We generate a Path for each
! feasible join method, and select the cheapest Path.
For each planning problem, therefore, we will have a list of relations
that are either base rels or joinrels constructed per sub-join-lists.
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
***************
*** 33,41 ****
#include "utils/syscache.h"
! /* These parameters are set by GUC */
! int from_collapse_limit;
! int join_collapse_limit;
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
--- 33,40 ----
#include "utils/syscache.h"
! /* This parameter is set by GUC */
! bool enable_join_ordering;
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
***************
*** 214,221 **** add_vars_to_targetlist(PlannerInfo *root, List *vars, Relids where_needed)
* joinlist must be joined in an order to be determined by make_one_rel()
* (note that legal orders may be constrained by SpecialJoinInfo nodes).
* A sub-joinlist represents a subproblem to be planned separately. Currently
! * sub-joinlists arise only from FULL OUTER JOIN or when collapsing of
! * subproblems is stopped by join_collapse_limit or from_collapse_limit.
*
* NOTE: when dealing with inner joins, it is appropriate to let a qual clause
* be evaluated at the lowest level where all the variables it mentions are
--- 213,220 ----
* joinlist must be joined in an order to be determined by make_one_rel()
* (note that legal orders may be constrained by SpecialJoinInfo nodes).
* A sub-joinlist represents a subproblem to be planned separately. Currently
! * sub-joinlists arise only from FULL OUTER JOIN or when enable_join_ordering
! * is set to false.
*
* NOTE: when dealing with inner joins, it is appropriate to let a qual clause
* be evaluated at the lowest level where all the variables it mentions are
***************
*** 284,302 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
else if (IsA(jtnode, FromExpr))
{
FromExpr *f = (FromExpr *) jtnode;
- int remaining;
ListCell *l;
/*
* First, recurse to handle child joins. We collapse subproblems into
! * a single joinlist whenever the resulting joinlist wouldn't exceed
! * from_collapse_limit members. Also, always collapse one-element
! * subproblems, since that won't lengthen the joinlist anyway.
*/
*qualscope = NULL;
*inner_join_rels = NULL;
joinlist = NIL;
- remaining = list_length(f->fromlist);
foreach(l, f->fromlist)
{
Relids sub_qualscope;
--- 283,297 ----
else if (IsA(jtnode, FromExpr))
{
FromExpr *f = (FromExpr *) jtnode;
ListCell *l;
/*
* First, recurse to handle child joins. We collapse subproblems into
! * a single joinlist.
*/
*qualscope = NULL;
*inner_join_rels = NULL;
joinlist = NIL;
foreach(l, f->fromlist)
{
Relids sub_qualscope;
***************
*** 309,320 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
inner_join_rels);
*qualscope = bms_add_members(*qualscope, sub_qualscope);
sub_members = list_length(sub_joinlist);
! remaining--;
! if (sub_members <= 1 ||
! list_length(joinlist) + sub_members + remaining <= from_collapse_limit)
! joinlist = list_concat(joinlist, sub_joinlist);
! else
! joinlist = lappend(joinlist, sub_joinlist);
}
/*
--- 304,310 ----
inner_join_rels);
*qualscope = bms_add_members(*qualscope, sub_qualscope);
sub_members = list_length(sub_joinlist);
! joinlist = list_concat(joinlist, sub_joinlist);
}
/*
***************
*** 469,484 **** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
/*
* Finally, compute the output joinlist. We fold subproblems together
! * except at a FULL JOIN or where join_collapse_limit would be
! * exceeded.
*/
if (j->jointype == JOIN_FULL)
{
/* force the join order exactly at this node */
joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
}
! else if (list_length(leftjoinlist) + list_length(rightjoinlist) <=
! join_collapse_limit)
{
/* OK to combine subproblems */
joinlist = list_concat(leftjoinlist, rightjoinlist);
--- 459,472 ----
/*
* Finally, compute the output joinlist. We fold subproblems together
! * except at a FULL JOIN or when enable_join_ordering is false.
*/
if (j->jointype == JOIN_FULL)
{
/* force the join order exactly at this node */
joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
}
! else if (enable_join_ordering)
{
/* OK to combine subproblems */
joinlist = list_concat(leftjoinlist, rightjoinlist);
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 656,661 **** static struct config_bool ConfigureNamesBool[] =
--- 656,669 ----
true, NULL, NULL
},
{
+ {"enable_join_ordering", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables the planner to reorder explicit joins."),
+ NULL
+ },
+ &enable_join_ordering,
+ true, NULL, NULL
+ },
+ {
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
gettext_noop("This algorithm attempts to do planning without "
***************
*** 1259,1286 **** static struct config_int ConfigureNamesInt[] =
100, 1, 10000, NULL, NULL
},
{
- {"from_collapse_limit", PGC_USERSET, QUERY_TUNING_OTHER,
- gettext_noop("Sets the FROM-list size beyond which subqueries "
- "are not collapsed."),
- gettext_noop("The planner will merge subqueries into upper "
- "queries if the resulting FROM list would have no more than "
- "this many items.")
- },
- &from_collapse_limit,
- 8, 1, INT_MAX, NULL, NULL
- },
- {
- {"join_collapse_limit", PGC_USERSET, QUERY_TUNING_OTHER,
- gettext_noop("Sets the FROM-list size beyond which JOIN "
- "constructs are not flattened."),
- gettext_noop("The planner will flatten explicit JOIN "
- "constructs into lists of FROM items whenever a "
- "list of no more than this many items would result.")
- },
- &join_collapse_limit,
- 8, 1, INT_MAX, NULL, NULL
- },
- {
{"geqo_threshold", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Sets the threshold of FROM items beyond which GEQO is used."),
NULL
--- 1267,1272 ----
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 190,195 ****
--- 190,196 ----
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
+ #enable_join_ordering = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
***************
*** 219,227 ****
#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
- #from_collapse_limit = 8
- #join_collapse_limit = 8 # 1 disables collapsing of explicit
- # JOIN clauses
#------------------------------------------------------------------------------
--- 220,225 ----
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
***************
*** 79,86 **** extern bool is_projection_capable_plan(Plan *plan);
/*
* prototypes for plan/initsplan.c
*/
! extern int from_collapse_limit;
! extern int join_collapse_limit;
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
--- 79,85 ----
/*
* prototypes for plan/initsplan.c
*/
! extern bool enable_join_ordering;
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
*** a/src/test/regress/expected/rangefuncs.out
--- b/src/test/regress/expected/rangefuncs.out
***************
*** 1,16 ****
SELECT name, setting FROM pg_settings WHERE name LIKE 'enable%';
! name | setting
! -------------------+---------
! enable_bitmapscan | on
! enable_hashagg | on
! enable_hashjoin | on
! enable_indexscan | on
! enable_mergejoin | on
! enable_nestloop | on
! enable_seqscan | on
! enable_sort | on
! enable_tidscan | on
! (9 rows)
CREATE TABLE foo2(fooid int, f2 int);
INSERT INTO foo2 VALUES(1, 11);
--- 1,17 ----
SELECT name, setting FROM pg_settings WHERE name LIKE 'enable%';
! name | setting
! ----------------------+---------
! enable_bitmapscan | on
! enable_hashagg | on
! enable_hashjoin | on
! enable_indexscan | on
! enable_join_ordering | on
! enable_mergejoin | on
! enable_nestloop | on
! enable_seqscan | on
! enable_sort | on
! enable_tidscan | on
! (10 rows)
CREATE TABLE foo2(fooid int, f2 int);
INSERT INTO foo2 VALUES(1, 11);
On Wednesday 08 July 2009 23:46:02 Tom Lane wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea....
If memory serves, we actually had exactly that at some point. But I
think the reason it got taken out was that it interfered with the
behavior of the random() function for everything else. We'd have to
give GEQO its own private random number generator.
All of GEQO's usage of random() seems to be concentrated in geqo_random.h - so
it would be a small change.
I will happily tackle that, if only to narrow down in which cases geqo fails
to plan - a behaviour we have seen at times with somewhat crazier queries.
The only question I have is whether random_r or similar is available on
enough platforms... Does anybody have an idea about this?
On most unixoid systems one could just wrap erand48() if random_r is not
available.
Windows?
Andres
Andres Freund <andres@anarazel.de> writes:
The only question I have is whether random_r or similar is available on
enough platforms... Does anybody have an idea about this?
On most unixoid systems one could just wrap erand48() if random_r is not
available.
Windows?
random_r() isn't in the Single Unix Spec AFAICS, and I also don't find
it on HPUX 10.20, so I'd vote against depending on it. What I do see
in SUS is initstate() and setstate() which could be used to control
the random() function:
http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html
It would also work to leave random() for use by the core code and have
GEQO depend on something from the drand48() family:
http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html
Probably drand48() is less random than random(), but for the limited
purposes of GEQO I doubt we care very much.
So far as I can find in a quick google search, neither of these families
of functions exist on Windows :-(. So I think maybe the best approach
is the second one --- we could implement a port/ module that provides a
version of whichever drand48 function we need.
On reflection I think the best user API is probably a "geqo_seed" GUC in
the range 0 to 1, and have GEQO always reset its seed to that value at
start of a planning cycle. This ensures plan stability, and if you need
to experiment with alternative plans you can change to different seed
values. The "no reset" behavior doesn't seem to have much real-world
usefulness, because even if you chance to get a good plan, you have no
way to reproduce it...
regards, tom lane
Hi,
On Saturday 11 July 2009 18:23:59 Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
The only question I have is whether random_r or similar is available on
enough platforms... Does anybody have an idea about this?
On most unixoid systems one could just wrap erand48() if random_r is not
available.
Windows?
random_r() isn't in the Single Unix Spec AFAICS, and I also don't find
it on HPUX 10.20, so I'd vote against depending on it. What I do see
in SUS is initstate() and setstate() which could be used to control
the random() function:
http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html
Using setstate() has a bit too many possible implications for my taste -
especially as there is no way to portably get/save the current random state.
(Running a known query to reset random's seed, or so.)
It would also work to leave random() for use by the core code and have
GEQO depend on something from the drand48() family:
http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html
Probably drand48() is less random than random(), but for the limited
purposes of GEQO I doubt we care very much.
Yes.
So far as I can find in a quick google search, neither of these families
of functions exist on Windows :-(. So I think maybe the best approach
is the second one --- we could implement a port/ module that provides a
version of whichever drand48 function we need.
Okay.
It would be possible to implement random_r the same way if we are going to
write something for Windows anyway - is it possible that it might be useful
somewhere else?
On reflection I think the best user API is probably a "geqo_seed" GUC in
the range 0 to 1, and have GEQO always reset its seed to that value at
start of a planning cycle. This ensures plan stability, and if you need
to experiment with alternative plans you can change to different seed
values. The "no reset" behavior doesn't seem to have much real-world
usefulness, because even if you chance to get a good plan, you have no
way to reproduce it...
That was my thinking as well.
Should geqo_seed be documented from start or not?
Andres
Andres Freund <andres@anarazel.de> writes:
On Saturday 11 July 2009 18:23:59 Tom Lane wrote:
So far as I can find in a quick google search, neither of these families
of functions exist on Windows :-(. So I think maybe the best approach
is the second one --- we could implement a port/ module that provides a
version of whichever drand48 function we need.
It would be possible to implement random_r the same way if we are going to
write something for Windows anyway - is it possible that it might be useful
somewhere else?
I think a decent version of random_r would be more work than it's worth.
(In practice we'd probably just lift the module from one of the BSDen
anyway, so maybe "work" is the wrong measure here, but code size is
still relevant.)
regards, tom lane
On Sat, Jul 11, 2009 at 12:23:59PM -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
The only question I have is whether random_r or similar is available on
enough platforms... Does anybody have an idea about this?
On most unixoid systems one could just wrap erand48() if random_r is not
available.
Windows?
random_r() isn't in the Single Unix Spec AFAICS, and I also don't find
it on HPUX 10.20, so I'd vote against depending on it. What I do see
in SUS is initstate() and setstate() which could be used to control
the random() function:
http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html
It would also work to leave random() for use by the core code and have
GEQO depend on something from the drand48() family:
http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html
Probably drand48() is less random than random(), but for the limited
purposes of GEQO I doubt we care very much.
Ugh, tracking down problems caused by a poor random number generator
is difficult. Poor randomness often causes weird results from
algorithms that were designed around the assumption of a "random"
number.
So far as I can find in a quick google search, neither of these families
of functions exist on Windows :-(. So I think maybe the best approach
is the second one --- we could implement a port/ module that provides a
version of whichever drand48 function we need.
I think that having a port/module for a random number generator is a
good idea. There are a number of good, fast random number generators
to choose from.
Cheers,
Ken
On reflection I think the best user API is probably a "geqo_seed" GUC in
the range 0 to 1, and have GEQO always reset its seed to that value at
start of a planning cycle. This ensures plan stability, and if you need
to experiment with alternative plans you can change to different seed
values. The "no reset" behavior doesn't seem to have much real-world
usefulness, because even if you chance to get a good plan, you have no
way to reproduce it...
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Saturday 11 July 2009 18:23:59 Tom Lane wrote:
On reflection I think the best user API is probably a "geqo_seed" GUC in
the range 0 to 1, and have GEQO always reset its seed to that value at
start of a planning cycle. This ensures plan stability, and if you need
to experiment with alternative plans you can change to different seed
values. The "no reset" behavior doesn't seem to have much real-world
usefulness, because even if you chance to get a good plan, you have no
way to reproduce it...
I just realized doing it in a naive way (in geqo()) causes the state to be
reset multiple times during one query - at every invocation of
make_rel_from_joinlist.
Does anybody see a problem with that?
Andres
Andres Freund <andres@anarazel.de> writes:
I just realized doing it in a naive way (in geqo()) causes the state to be
reset multiple times during one query- at every invocation of
make_rel_from_joinlist.
Does anybody see a problem with that?
I think that's probably what we want. Otherwise, you'd have a situation
where two identical subproblems might get planned differently.
This ties into something I was thinking about earlier: the planner is
potentially recursive (eg, it might call a user-defined function that
contains a plannable query), and it'd probably be a good idea if the
behavior of GEQO wasn't sensitive to that. So the RNG's state needs to
be kept in PlannerInfo or some associated structure, not be global.
regards, tom lane
Hi Tom,
On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
I just realized doing it in a naive way (in geqo()) causes the state to
be reset multiple times during one query - at every invocation of
make_rel_from_joinlist.
Does anybody see a problem with that?
I think that's probably what we want. Otherwise, you'd have a situation
where two identical subproblems might get planned differently.
This ties into something I was thinking about earlier: the planner is
potentially recursive (eg, it might call a user-defined function that
contains a plannable query), and it'd probably be a good idea if the
behavior of GEQO wasn't sensitive to that. So the RNG's state needs to
be kept in PlannerInfo or some associated structure, not be global.
Hm. I looked a bit into this and I don't see a real problem with a global
random state if that one gets reinitialized on every geqo() invocation. If I
understood the code correctly - which is not certain at all - while
make_rel_from_joinlist is itself recursive, the actual join search code is
not recursive. Correct?
Thus it would be enough to reset the seed on every geqo() invocation...
Any counter arguments?
Andres
Andres Freund <andres@anarazel.de> writes:
On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
This ties into something I was thinking about earlier: the planner is
potentially recursive (eg, it might call a user-defined function that
contains a plannable query), and it'd probably be a good idea if the
behavior of GEQO wasn't sensitive to that. So the RNG's state needs to
be kept in PlannerInfo or some associated structure, not be global.
Hm. I looked a bit into this and I don't see a real problem with a global
random state if that one gets reinitialized on every geqo() invocation. If I
understood the code correctly - which is not certain at all - while
make_rel_from_joinlist is itself recursive, the actual join search code is
not recursive. Correct?
I wouldn't count on that. GEQO is not recursive with respect to a
particular query, but there's still the risk of the planner deciding
to call out to some user-defined code while it does selectivity
estimates. So the planner as a whole has to be re-entrant.
Now you could probably argue that the impact of extra RNG resets on
the overall behavior is small enough that it doesn't matter. But if
you really want to promise consistent GEQO behavior then I think the
RNG state has to be local to a particular planner instantiation.
regards, tom lane
On Sunday 12 July 2009 16:44:59 Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
This ties into something I was thinking about earlier: the planner is
potentially recursive (eg, it might call a user-defined function that
contains a plannable query), and it'd probably be a good idea if the
behavior of GEQO wasn't sensitive to that. So the RNG's state needs to
be kept in PlannerInfo or some associated structure, not be global.
Hm. I looked a bit into this and I don't see a real problem with a global
random state if that one gets reinitialized on every geqo() invocation.
If I understood the code correctly - which is not certain at all - while
make_rel_from_joinlist is itself recursive, the actual join search code
is not recursive. Correct?
I wouldn't count on that. GEQO is not recursive with respect to a
particular query, but there's still the risk of the planner deciding
to call out to some user-defined code while it does selectivity
estimates. So the planner as a whole has to be re-entrant.
Now you could probably argue that the impact of extra RNG resets on
the overall behavior is small enough that it doesn't matter. But if
you really want to promise consistent GEQO behavior then I think the
RNG state has to be local to a particular planner instantiation.
I just did not see that it could call selectivity estimate functions. This
will mean adding an additional parameter (PlannerInfo) to most of the geqo
functions, but I don't see a problem there.
Has anybody tried GEQO without ERX in recent times?
Andres
Andres Freund <andres@anarazel.de> writes:
Has anybody tried GEQO without ERX in recent times?
No, I don't think so. Anything that's ifdef'd out at the moment has
been that way for quite a few years, and so is likely broken anyhow :-(
regards, tom lane
On Saturday 11 July 2009 17:19:18 Andres Freund wrote:
On Wednesday 08 July 2009 23:46:02 Tom Lane wrote:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea....
If memory serves, we actually had exactly that at some point. But I
think the reason it got taken out was that it interfered with the
behavior of the random() function for everything else. We'd have to
give GEQO its own private random number generator.
All of GEQO's usage of random() seems to be concentrated in geqo_random.h - so
it would be a small change.
I will happily tackle that, if only to narrow down in which cases geqo
fails to plan - a behaviour we have seen at times with somewhat crazier
queries.
The only question I have is whether random_r or similar is available on
enough platforms... Does anybody have an idea about this?
On most unixoid systems one could just wrap erand48() if random_r is not
available.
Sorry, I had to shelve the issue for a while; here is a preliminary patch.
- added a "void *join_search_private" to hold the random number generator
which would also be usable by other join search implementations
- PlannerInfo *root is now passed to all non-static geqo related functions. It
seems cleaner to do this generally than on a function-by-function basis
- replaced the sparsely used GeqoPrivateData by GeqoEvalData which is passed
via join_search_private
- if geqo_seed = 0 a global/per-backend state is used
What's missing:
- Windows prng support
- Perhaps: Configure check for erand48
- Documentation for geqo_seed
I did not find any Windows API for a random number generator with
visible/changeable state, so I think there is not much alternative to pulling
in random_r or similar from *BSD if this seems worth pursuing.
Mostly sensible?
Andres
Attachments:
0001-Support-a-geqo_seed-GUC-to-make-planning-via-GEQO-re.patch (text/x-patch)
From 7d4d96e68f9e6740c7034acc9da5e511001c2aa7 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 14 Jul 2009 02:25:18 +0200
Subject: [PATCH] Support a 'geqo_seed' GUC to make planning via GEQO repeatable.
---
src/backend/optimizer/geqo/Makefile | 1 +
src/backend/optimizer/geqo/geqo_copy.c | 4 +-
src/backend/optimizer/geqo/geqo_cx.c | 6 +-
src/backend/optimizer/geqo/geqo_erx.c | 47 ++++++------
src/backend/optimizer/geqo/geqo_eval.c | 28 ++++----
src/backend/optimizer/geqo/geqo_main.c | 90 ++++++++++++-----------
src/backend/optimizer/geqo/geqo_mutation.c | 10 +-
src/backend/optimizer/geqo/geqo_ox1.c | 8 +-
src/backend/optimizer/geqo/geqo_ox2.c | 7 +-
src/backend/optimizer/geqo/geqo_pmx.c | 7 +-
src/backend/optimizer/geqo/geqo_pool.c | 23 +++---
src/backend/optimizer/geqo/geqo_px.c | 8 +-
src/backend/optimizer/geqo/geqo_random.c | 42 +++++++++++
src/backend/optimizer/geqo/geqo_recombination.c | 9 +-
src/backend/optimizer/geqo/geqo_selection.c | 19 +++--
src/backend/optimizer/path/allpaths.c | 2 +
src/backend/utils/misc/guc.c | 9 ++
src/include/nodes/relation.h | 3 +
src/include/optimizer/geqo.h | 17 ++---
src/include/optimizer/geqo_copy.h | 3 +-
src/include/optimizer/geqo_mutation.h | 3 +-
src/include/optimizer/geqo_pool.h | 14 ++--
src/include/optimizer/geqo_random.h | 9 +-
src/include/optimizer/geqo_recombination.h | 33 +++++---
src/include/optimizer/geqo_selection.h | 4 +-
25 files changed, 244 insertions(+), 162 deletions(-)
create mode 100644 src/backend/optimizer/geqo/geqo_random.c
diff --git a/src/backend/optimizer/geqo/Makefile b/src/backend/optimizer/geqo/Makefile
index dbc6c28..e5a01d7 100644
*** a/src/backend/optimizer/geqo/Makefile
--- b/src/backend/optimizer/geqo/Makefile
*************** top_builddir = ../../../..
*** 14,19 ****
--- 14,20 ----
include $(top_builddir)/src/Makefile.global
OBJS = geqo_copy.o geqo_eval.o geqo_main.o geqo_misc.o \
+ geqo_random.o \
geqo_mutation.o geqo_pool.o geqo_recombination.o \
geqo_selection.o \
geqo_erx.o geqo_pmx.o geqo_cx.o geqo_px.o geqo_ox1.o geqo_ox2.o
diff --git a/src/backend/optimizer/geqo/geqo_copy.c b/src/backend/optimizer/geqo/geqo_copy.c
index 83af33a..373a924 100644
*** a/src/backend/optimizer/geqo/geqo_copy.c
--- b/src/backend/optimizer/geqo/geqo_copy.c
***************
*** 35,40 ****
--- 35,41 ----
#include "postgres.h"
#include "optimizer/geqo_copy.h"
+ #include "optimizer/geqo_copy.h"
/* geqo_copy
*
***************
*** 42,48 ****
*
*/
void
! geqo_copy(Chromosome *chromo1, Chromosome *chromo2, int string_length)
{
int i;
--- 43,50 ----
*
*/
void
! geqo_copy(PlannerInfo *root, Chromosome *chromo1, Chromosome *chromo2,
! int string_length)
{
int i;
diff --git a/src/backend/optimizer/geqo/geqo_cx.c b/src/backend/optimizer/geqo/geqo_cx.c
index 3d5102f..ad861ce 100644
*** a/src/backend/optimizer/geqo/geqo_cx.c
--- b/src/backend/optimizer/geqo/geqo_cx.c
***************
*** 35,40 ****
--- 35,41 ----
#include "postgres.h"
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_recombination.h"
#include "optimizer/geqo_random.h"
***************
*** 44,50 ****
* cycle crossover
*/
int
! cx(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table)
{
int i,
--- 45,52 ----
* cycle crossover
*/
int
! cx(PlannerInfo *root, Gene *tour1, Gene *tour2, Gene *offspring,
! int num_gene, City *city_table)
{
int i,
*************** cx(Gene *tour1, Gene *tour2, Gene *offsp
*** 62,68 ****
}
/* choose random cycle starting position */
! start_pos = geqo_randint(num_gene - 1, 0);
/* child inherits first city */
offspring[start_pos] = tour1[start_pos];
--- 64,70 ----
}
/* choose random cycle starting position */
! start_pos = geqo_randint(root, num_gene - 1, 0);
/* child inherits first city */
offspring[start_pos] = tour1[start_pos];
diff --git a/src/backend/optimizer/geqo/geqo_erx.c b/src/backend/optimizer/geqo/geqo_erx.c
index 35e1a28..5bae059 100644
*** a/src/backend/optimizer/geqo/geqo_erx.c
--- b/src/backend/optimizer/geqo/geqo_erx.c
***************
*** 36,46 ****
#include "optimizer/geqo_random.h"
! static int gimme_edge(Gene gene1, Gene gene2, Edge *edge_table);
! static void remove_gene(Gene gene, Edge edge, Edge *edge_table);
! static Gene gimme_gene(Edge edge, Edge *edge_table);
! static Gene edge_failure(Gene *gene, int index, Edge *edge_table, int num_gene);
/* alloc_edge_table
--- 36,46 ----
#include "optimizer/geqo_random.h"
! static int gimme_edge(PlannerInfo *root, Gene gene1, Gene gene2, Edge *edge_table);
! static void remove_gene(PlannerInfo *root, Gene gene, Edge edge, Edge *edge_table);
! static Gene gimme_gene(PlannerInfo *root, Edge edge, Edge *edge_table);
! static Gene edge_failure(PlannerInfo *root, Gene *gene, int index, Edge *edge_table, int num_gene);
/* alloc_edge_table
*************** static Gene edge_failure(Gene *gene, int
*** 50,56 ****
*/
Edge *
! alloc_edge_table(int num_gene)
{
Edge *edge_table;
--- 50,56 ----
*/
Edge *
! alloc_edge_table(PlannerInfo *root, int num_gene)
{
Edge *edge_table;
*************** alloc_edge_table(int num_gene)
*** 70,76 ****
*
*/
void
! free_edge_table(Edge *edge_table)
{
pfree(edge_table);
}
--- 70,76 ----
*
*/
void
! free_edge_table(PlannerInfo *root, Edge *edge_table)
{
pfree(edge_table);
}
*************** free_edge_table(Edge *edge_table)
*** 89,95 ****
*
*/
float
! gimme_edge_table(Gene *tour1, Gene *tour2, int num_gene, Edge *edge_table)
{
int i,
index1,
--- 89,96 ----
*
*/
float
! gimme_edge_table(PlannerInfo *root, Gene *tour1, Gene *tour2,
! int num_gene, Edge *edge_table)
{
int i,
index1,
*************** gimme_edge_table(Gene *tour1, Gene *tour
*** 121,131 ****
* twice per edge
*/
! edge_total += gimme_edge(tour1[index1], tour1[index2], edge_table);
! gimme_edge(tour1[index2], tour1[index1], edge_table);
! edge_total += gimme_edge(tour2[index1], tour2[index2], edge_table);
! gimme_edge(tour2[index2], tour2[index1], edge_table);
}
/* return average number of edges per index */
--- 122,132 ----
* twice per edge
*/
! edge_total += gimme_edge(root, tour1[index1], tour1[index2], edge_table);
! gimme_edge(root, tour1[index2], tour1[index1], edge_table);
! edge_total += gimme_edge(root, tour2[index1], tour2[index2], edge_table);
! gimme_edge(root, tour2[index2], tour2[index1], edge_table);
}
/* return average number of edges per index */
*************** gimme_edge_table(Gene *tour1, Gene *tour
*** 147,153 ****
* 0 if edge was already registered and edge_table is unchanged
*/
static int
! gimme_edge(Gene gene1, Gene gene2, Edge *edge_table)
{
int i;
int edges;
--- 148,154 ----
* 0 if edge was already registered and edge_table is unchanged
*/
static int
! gimme_edge(PlannerInfo *root, Gene gene1, Gene gene2, Edge *edge_table)
{
int i;
int edges;
*************** gimme_edge(Gene gene1, Gene gene2, Edge
*** 189,200 ****
*
*/
int
! gimme_tour(Edge *edge_table, Gene *new_gene, int num_gene)
{
int i;
int edge_failures = 0;
! new_gene[0] = (Gene) geqo_randint(num_gene, 1); /* choose int between 1
* and num_gene */
for (i = 1; i < num_gene; i++)
--- 190,201 ----
*
*/
int
! gimme_tour(PlannerInfo *root, Edge *edge_table, Gene *new_gene, int num_gene)
{
int i;
int edge_failures = 0;
! new_gene[0] = (Gene) geqo_randint(root, num_gene, 1); /* choose int between 1
* and num_gene */
for (i = 1; i < num_gene; i++)
*************** gimme_tour(Edge *edge_table, Gene *new_g
*** 204,221 ****
* table
*/
! remove_gene(new_gene[i - 1], edge_table[(int) new_gene[i - 1]], edge_table);
/* find destination for the newly entered point */
if (edge_table[new_gene[i - 1]].unused_edges > 0)
! new_gene[i] = gimme_gene(edge_table[(int) new_gene[i - 1]], edge_table);
else
{ /* cope with fault */
edge_failures++;
! new_gene[i] = edge_failure(new_gene, i - 1, edge_table, num_gene);
}
/* mark this node as incorporated */
--- 205,222 ----
* table
*/
! remove_gene(root, new_gene[i - 1], edge_table[(int) new_gene[i - 1]], edge_table);
/* find destination for the newly entered point */
if (edge_table[new_gene[i - 1]].unused_edges > 0)
! new_gene[i] = gimme_gene(root, edge_table[(int) new_gene[i - 1]], edge_table);
else
{ /* cope with fault */
edge_failures++;
! new_gene[i] = edge_failure(root, new_gene, i - 1, edge_table, num_gene);
}
/* mark this node as incorporated */
*************** gimme_tour(Edge *edge_table, Gene *new_g
*** 235,241 ****
*
*/
static void
! remove_gene(Gene gene, Edge edge, Edge *edge_table)
{
int i,
j;
--- 236,242 ----
*
*/
static void
! remove_gene(PlannerInfo *root, Gene gene, Edge edge, Edge *edge_table)
{
int i,
j;
*************** remove_gene(Gene gene, Edge edge, Edge *
*** 277,283 ****
*
*/
static Gene
! gimme_gene(Edge edge, Edge *edge_table)
{
int i;
Gene friend;
--- 278,284 ----
*
*/
static Gene
! gimme_gene(PlannerInfo *root, Edge edge, Edge *edge_table)
{
int i;
Gene friend;
*************** gimme_gene(Edge edge, Edge *edge_table)
*** 340,346 ****
/* random decision of the possible candidates to use */
! rand_decision = (int) geqo_randint(minimum_count - 1, 0);
for (i = 0; i < edge.unused_edges; i++)
--- 341,347 ----
/* random decision of the possible candidates to use */
! rand_decision = geqo_randint(root, minimum_count - 1, 0);
for (i = 0; i < edge.unused_edges; i++)
*************** gimme_gene(Edge edge, Edge *edge_table)
*** 368,374 ****
*
*/
static Gene
! edge_failure(Gene *gene, int index, Edge *edge_table, int num_gene)
{
int i;
Gene fail_gene = gene[index];
--- 369,375 ----
*
*/
static Gene
! edge_failure(PlannerInfo *root, Gene *gene, int index, Edge *edge_table, int num_gene)
{
int i;
Gene fail_gene = gene[index];
*************** edge_failure(Gene *gene, int index, Edge
*** 401,407 ****
if (four_count != 0)
{
! rand_decision = (int) geqo_randint(four_count - 1, 0);
for (i = 1; i <= num_gene; i++)
{
--- 402,408 ----
if (four_count != 0)
{
! rand_decision = geqo_randint(root, four_count - 1, 0);
for (i = 1; i <= num_gene; i++)
{
*************** edge_failure(Gene *gene, int index, Edge
*** 423,429 ****
else if (remaining_edges != 0)
{
/* random decision of the gene with remaining edges */
! rand_decision = (int) geqo_randint(remaining_edges - 1, 0);
for (i = 1; i <= num_gene; i++)
{
--- 424,430 ----
else if (remaining_edges != 0)
{
/* random decision of the gene with remaining edges */
! rand_decision = geqo_randint(root, remaining_edges - 1, 0);
for (i = 1; i <= num_gene; i++)
{
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index 04b3bfe..11903a5 100644
*** a/src/backend/optimizer/geqo/geqo_eval.c
--- b/src/backend/optimizer/geqo/geqo_eval.c
*************** static bool desirable_join(PlannerInfo *
*** 42,48 ****
* Returns cost of a query tree as an individual of the population.
*/
Cost
! geqo_eval(Gene *tour, int num_gene, GeqoEvalData *evaldata)
{
MemoryContext mycontext;
MemoryContext oldcxt;
--- 42,48 ----
* Returns cost of a query tree as an individual of the population.
*/
Cost
! geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
{
MemoryContext mycontext;
MemoryContext oldcxt;
*************** geqo_eval(Gene *tour, int num_gene, Geqo
*** 94,106 ****
* (If we are dealing with enough join rels, which we very likely are, a
* new hash table will get built and used locally.)
*/
! savelength = list_length(evaldata->root->join_rel_list);
! savehash = evaldata->root->join_rel_hash;
! evaldata->root->join_rel_hash = NULL;
/* construct the best path for the given combination of relations */
! joinrel = gimme_tree(tour, num_gene, evaldata);
/*
* compute fitness
--- 94,106 ----
* (If we are dealing with enough join rels, which we very likely are, a
* new hash table will get built and used locally.)
*/
! savelength = list_length(root->join_rel_list);
! savehash = root->join_rel_hash;
! root->join_rel_hash = NULL;
/* construct the best path for the given combination of relations */
! joinrel = gimme_tree(root, tour, num_gene);
/*
* compute fitness
*************** geqo_eval(Gene *tour, int num_gene, Geqo
*** 117,125 ****
* Restore join_rel_list to its former state, and put back original
* hashtable if any.
*/
! evaldata->root->join_rel_list = list_truncate(evaldata->root->join_rel_list,
! savelength);
! evaldata->root->join_rel_hash = savehash;
/* release all the memory acquired within gimme_tree */
MemoryContextSwitchTo(oldcxt);
--- 117,125 ----
* Restore join_rel_list to its former state, and put back original
* hashtable if any.
*/
! root->join_rel_list = list_truncate(root->join_rel_list,
! savelength);
! root->join_rel_hash = savehash;
/* release all the memory acquired within gimme_tree */
MemoryContextSwitchTo(oldcxt);
*************** geqo_eval(Gene *tour, int num_gene, Geqo
*** 134,140 ****
* order.
*
* 'tour' is the proposed join order, of length 'num_gene'
- * 'evaldata' contains the context we need
*
* Returns a new join relation whose cheapest path is the best plan for
* this join order. NB: will return NULL if join order is invalid.
--- 134,139 ----
*************** geqo_eval(Gene *tour, int num_gene, Geqo
*** 153,159 ****
* plans.
*/
RelOptInfo *
! gimme_tree(Gene *tour, int num_gene, GeqoEvalData *evaldata)
{
RelOptInfo **stack;
int stack_depth;
--- 152,158 ----
* plans.
*/
RelOptInfo *
! gimme_tree(PlannerInfo *root, Gene *tour, int num_gene)
{
RelOptInfo **stack;
int stack_depth;
*************** gimme_tree(Gene *tour, int num_gene, Geq
*** 193,200 ****
/* Get the next input relation and push it */
cur_rel_index = (int) tour[rel_count];
! stack[stack_depth] = (RelOptInfo *) list_nth(evaldata->initial_rels,
! cur_rel_index - 1);
stack_depth++;
/*
--- 192,200 ----
/* Get the next input relation and push it */
cur_rel_index = (int) tour[rel_count];
! stack[stack_depth] = (RelOptInfo *) list_nth(
! ((GeqoPrivateData*)root->join_search_private)->initial_rels,
! cur_rel_index - 1);
stack_depth++;
/*
*************** gimme_tree(Gene *tour, int num_gene, Geq
*** 211,217 ****
* have exhausted the input, the heuristics can't prevent popping.
*/
if (rel_count < num_gene - 1 &&
! !desirable_join(evaldata->root, outer_rel, inner_rel))
break;
/*
--- 211,217 ----
* have exhausted the input, the heuristics can't prevent popping.
*/
if (rel_count < num_gene - 1 &&
! !desirable_join(root, outer_rel, inner_rel))
break;
/*
*************** gimme_tree(Gene *tour, int num_gene, Geq
*** 220,226 ****
* root->join_rel_list yet, and so the paths constructed for it
* will only include the ones we want.
*/
! joinrel = make_join_rel(evaldata->root, outer_rel, inner_rel);
/* Can't pop stack here if join order is not valid */
if (!joinrel)
--- 220,226 ----
* root->join_rel_list yet, and so the paths constructed for it
* will only include the ones we want.
*/
! joinrel = make_join_rel(root, outer_rel, inner_rel);
/* Can't pop stack here if join order is not valid */
if (!joinrel)
diff --git a/src/backend/optimizer/geqo/geqo_main.c b/src/backend/optimizer/geqo/geqo_main.c
index 72b0b57..10f64f4 100644
*** a/src/backend/optimizer/geqo/geqo_main.c
--- b/src/backend/optimizer/geqo/geqo_main.c
***************
*** 29,34 ****
--- 29,35 ----
#include "optimizer/geqo_misc.h"
#include "optimizer/geqo_pool.h"
#include "optimizer/geqo_selection.h"
+ #include "optimizer/geqo_random.h"
/*
*************** int Geqo_effort;
*** 38,43 ****
--- 39,45 ----
int Geqo_pool_size;
int Geqo_generations;
double Geqo_selection_bias;
+ double Geqo_seed;
static int gimme_pool_size(int nr_rel);
*************** static int gimme_number_generations(int
*** 63,69 ****
RelOptInfo *
geqo(PlannerInfo *root, int number_of_rels, List *initial_rels)
{
- GeqoEvalData evaldata;
int generation;
Chromosome *momma;
Chromosome *daddy;
--- 65,70 ----
*************** geqo(PlannerInfo *root, int number_of_re
*** 88,96 ****
int mutations = 0;
#endif
! /* set up evaldata */
! evaldata.root = root;
! evaldata.initial_rels = initial_rels;
/* set GA parameters */
pool_size = gimme_pool_size(number_of_rels);
--- 89,102 ----
int mutations = 0;
#endif
! /* set up private information */
! root->join_search_private = (GeqoPrivateData *) palloc0(sizeof(GeqoPrivateData));
! ((GeqoPrivateData *) root->join_search_private)->initial_rels = initial_rels;
!
! /* initialize the private random number generator */
! if (Geqo_seed != 0)
! geqo_seed(root, Geqo_seed);
/* set GA parameters */
pool_size = gimme_pool_size(number_of_rels);
*************** geqo(PlannerInfo *root, int number_of_re
*** 98,110 ****
status_interval = 10;
/* allocate genetic pool memory */
! pool = alloc_pool(pool_size, number_of_rels);
/* random initialization of the pool */
! random_init_pool(pool, &evaldata);
/* sort the pool according to cheapest path as fitness */
! sort_pool(pool); /* we have to do it only one time, since all
* kids replace the worst individuals in
* future (-> geqo_pool.c:spread_chromo ) */
--- 104,116 ----
status_interval = 10;
/* allocate genetic pool memory */
! pool = alloc_pool(root, pool_size, number_of_rels);
/* random initialization of the pool */
! random_init_pool(root, pool);
/* sort the pool according to cheapest path as fitness */
! sort_pool(root, pool); /* we have to do it only one time, since all
* kids replace the worst individuals in
* future (-> geqo_pool.c:spread_chromo ) */
*************** geqo(PlannerInfo *root, int number_of_re
*** 116,164 ****
#endif
/* allocate chromosome momma and daddy memory */
! momma = alloc_chromo(pool->string_length);
! daddy = alloc_chromo(pool->string_length);
#if defined (ERX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using edge recombination crossover [ERX]");
#endif
/* allocate edge table memory */
! edge_table = alloc_edge_table(pool->string_length);
#elif defined(PMX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using partially matched crossover [PMX]");
#endif
/* allocate chromosome kid memory */
! kid = alloc_chromo(pool->string_length);
#elif defined(CX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using cycle crossover [CX]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(pool->string_length);
! city_table = alloc_city_table(pool->string_length);
#elif defined(PX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using position crossover [PX]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(pool->string_length);
! city_table = alloc_city_table(pool->string_length);
#elif defined(OX1)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using order crossover [OX1]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(pool->string_length);
! city_table = alloc_city_table(pool->string_length);
#elif defined(OX2)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using order crossover [OX2]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(pool->string_length);
! city_table = alloc_city_table(pool->string_length);
#endif
--- 122,170 ----
#endif
/* allocate chromosome momma and daddy memory */
! momma = alloc_chromo(root, pool->string_length);
! daddy = alloc_chromo(root, pool->string_length);
#if defined (ERX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using edge recombination crossover [ERX]");
#endif
/* allocate edge table memory */
! edge_table = alloc_edge_table(root, pool->string_length);
#elif defined(PMX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using partially matched crossover [PMX]");
#endif
/* allocate chromosome kid memory */
! kid = alloc_chromo(root, pool->string_length);
#elif defined(CX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using cycle crossover [CX]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(root, pool->string_length);
! city_table = alloc_city_table(root, pool->string_length);
#elif defined(PX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using position crossover [PX]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(root, pool->string_length);
! city_table = alloc_city_table(root, pool->string_length);
#elif defined(OX1)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using order crossover [OX1]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(root, pool->string_length);
! city_table = alloc_city_table(root, pool->string_length);
#elif defined(OX2)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using order crossover [OX2]");
#endif
/* allocate city table memory */
! kid = alloc_chromo(root, pool->string_length);
! city_table = alloc_city_table(root, pool->string_length);
#endif
*************** geqo(PlannerInfo *root, int number_of_re
*** 168,212 ****
for (generation = 0; generation < number_generations; generation++)
{
/* SELECTION: using linear bias function */
! geqo_selection(momma, daddy, pool, Geqo_selection_bias);
#if defined (ERX)
/* EDGE RECOMBINATION CROSSOVER */
! difference = gimme_edge_table(momma->string, daddy->string, pool->string_length, edge_table);
kid = momma;
/* are there any edge failures ? */
! edge_failures += gimme_tour(edge_table, kid->string, pool->string_length);
#elif defined(PMX)
/* PARTIALLY MATCHED CROSSOVER */
! pmx(momma->string, daddy->string, kid->string, pool->string_length);
#elif defined(CX)
/* CYCLE CROSSOVER */
! cycle_diffs = cx(momma->string, daddy->string, kid->string, pool->string_length, city_table);
/* mutate the child */
if (cycle_diffs == 0)
{
mutations++;
! geqo_mutation(kid->string, pool->string_length);
}
#elif defined(PX)
/* POSITION CROSSOVER */
! px(momma->string, daddy->string, kid->string, pool->string_length, city_table);
#elif defined(OX1)
/* ORDER CROSSOVER */
! ox1(momma->string, daddy->string, kid->string, pool->string_length, city_table);
#elif defined(OX2)
/* ORDER CROSSOVER */
! ox2(momma->string, daddy->string, kid->string, pool->string_length, city_table);
#endif
/* EVALUATE FITNESS */
! kid->worth = geqo_eval(kid->string, pool->string_length, &evaldata);
/* push the kid into the wilderness of life according to its worth */
! spread_chromo(kid, pool);
#ifdef GEQO_DEBUG
--- 174,218 ----
for (generation = 0; generation < number_generations; generation++)
{
/* SELECTION: using linear bias function */
! geqo_selection(root, momma, daddy, pool, Geqo_selection_bias);
#if defined (ERX)
/* EDGE RECOMBINATION CROSSOVER */
! difference = gimme_edge_table(root, momma->string, daddy->string, pool->string_length, edge_table);
kid = momma;
/* are there any edge failures ? */
! edge_failures += gimme_tour(root, edge_table, kid->string, pool->string_length);
#elif defined(PMX)
/* PARTIALLY MATCHED CROSSOVER */
! pmx(root, momma->string, daddy->string, kid->string, pool->string_length);
#elif defined(CX)
/* CYCLE CROSSOVER */
! cycle_diffs = cx(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
/* mutate the child */
if (cycle_diffs == 0)
{
mutations++;
! geqo_mutation(root, kid->string, pool->string_length);
}
#elif defined(PX)
/* POSITION CROSSOVER */
! px(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
#elif defined(OX1)
/* ORDER CROSSOVER */
! ox1(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
#elif defined(OX2)
/* ORDER CROSSOVER */
! ox2(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
#endif
/* EVALUATE FITNESS */
! kid->worth = geqo_eval(root, kid->string, pool->string_length);
/* push the kid into the wilderness of life according to its worth */
! spread_chromo(root, kid, pool);
#ifdef GEQO_DEBUG
*************** geqo(PlannerInfo *root, int number_of_re
*** 249,255 ****
*/
best_tour = (Gene *) pool->data[0].string;
! best_rel = gimme_tree(best_tour, pool->string_length, &evaldata);
if (best_rel == NULL)
elog(ERROR, "failed to make a valid plan");
--- 255,261 ----
*/
best_tour = (Gene *) pool->data[0].string;
! best_rel = gimme_tree(root, best_tour, pool->string_length);
if (best_rel == NULL)
elog(ERROR, "failed to make a valid plan");
*************** geqo(PlannerInfo *root, int number_of_re
*** 260,287 ****
#endif
/* ... free memory stuff */
! free_chromo(momma);
! free_chromo(daddy);
#if defined (ERX)
! free_edge_table(edge_table);
#elif defined(PMX)
! free_chromo(kid);
#elif defined(CX)
! free_chromo(kid);
! free_city_table(city_table);
#elif defined(PX)
! free_chromo(kid);
! free_city_table(city_table);
#elif defined(OX1)
! free_chromo(kid);
! free_city_table(city_table);
#elif defined(OX2)
! free_chromo(kid);
! free_city_table(city_table);
#endif
! free_pool(pool);
return best_rel;
}
--- 266,293 ----
#endif
/* ... free memory stuff */
! free_chromo(root, momma);
! free_chromo(root, daddy);
#if defined (ERX)
! free_edge_table(root, edge_table);
#elif defined(PMX)
! free_chromo(root, kid);
#elif defined(CX)
! free_chromo(root, kid);
! free_city_table(root, city_table);
#elif defined(PX)
! free_chromo(root, kid);
! free_city_table(root, city_table);
#elif defined(OX1)
! free_chromo(root, kid);
! free_city_table(root, city_table);
#elif defined(OX2)
! free_chromo(root, kid);
! free_city_table(root, city_table);
#endif
! free_pool(root, pool);
return best_rel;
}
diff --git a/src/backend/optimizer/geqo/geqo_mutation.c b/src/backend/optimizer/geqo/geqo_mutation.c
index 899410c..acf3b34 100644
*** a/src/backend/optimizer/geqo/geqo_mutation.c
--- b/src/backend/optimizer/geqo/geqo_mutation.c
***************
*** 36,56 ****
#include "optimizer/geqo_random.h"
void
! geqo_mutation(Gene *tour, int num_gene)
{
int swap1;
int swap2;
! int num_swaps = geqo_randint(num_gene / 3, 0);
Gene temp;
while (num_swaps > 0)
{
! swap1 = geqo_randint(num_gene - 1, 0);
! swap2 = geqo_randint(num_gene - 1, 0);
while (swap1 == swap2)
! swap2 = geqo_randint(num_gene - 1, 0);
temp = tour[swap1];
tour[swap1] = tour[swap2];
--- 36,56 ----
#include "optimizer/geqo_random.h"
void
! geqo_mutation(PlannerInfo *root, Gene *tour, int num_gene)
{
int swap1;
int swap2;
! int num_swaps = geqo_randint(root, num_gene / 3, 0);
Gene temp;
while (num_swaps > 0)
{
! swap1 = geqo_randint(root, num_gene - 1, 0);
! swap2 = geqo_randint(root, num_gene - 1, 0);
while (swap1 == swap2)
! swap2 = geqo_randint(root, num_gene - 1, 0);
temp = tour[swap1];
tour[swap1] = tour[swap2];
diff --git a/src/backend/optimizer/geqo/geqo_ox1.c b/src/backend/optimizer/geqo/geqo_ox1.c
index cd979df..d292e98 100644
*** a/src/backend/optimizer/geqo/geqo_ox1.c
--- b/src/backend/optimizer/geqo/geqo_ox1.c
***************
*** 34,39 ****
--- 34,40 ----
/*************************************************************/
#include "postgres.h"
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_recombination.h"
***************
*** 43,49 ****
* position crossover
*/
void
! ox1(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table)
{
int left,
right,
--- 44,51 ----
* position crossover
*/
void
! ox1(PlannerInfo *root, Gene *tour1, Gene *tour2, Gene *offspring, int num_gene,
! City *city_table)
{
int left,
right,
*************** ox1(Gene *tour1, Gene *tour2, Gene *offs
*** 56,63 ****
city_table[k].used = 0;
/* select portion to copy from tour1 */
! left = geqo_randint(num_gene - 1, 0);
! right = geqo_randint(num_gene - 1, 0);
if (left > right)
{
--- 58,65 ----
city_table[k].used = 0;
/* select portion to copy from tour1 */
! left = geqo_randint(root, num_gene - 1, 0);
! right = geqo_randint(root, num_gene - 1, 0);
if (left > right)
{
diff --git a/src/backend/optimizer/geqo/geqo_ox2.c b/src/backend/optimizer/geqo/geqo_ox2.c
index 0d2e0de..a2fdfe8 100644
*** a/src/backend/optimizer/geqo/geqo_ox2.c
--- b/src/backend/optimizer/geqo/geqo_ox2.c
***************
*** 34,39 ****
--- 34,40 ----
/*************************************************************/
#include "postgres.h"
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_recombination.h"
***************
*** 43,49 ****
* position crossover
*/
void
! ox2(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table)
{
int k,
j,
--- 44,50 ----
* position crossover
*/
void
! ox2(PlannerInfo *root, Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table)
{
int k,
j,
*************** ox2(Gene *tour1, Gene *tour2, Gene *offs
*** 60,71 ****
}
/* determine the number of positions to be inherited from tour1 */
! num_positions = geqo_randint(2 * num_gene / 3, num_gene / 3);
/* make a list of selected cities */
for (k = 0; k < num_positions; k++)
{
! pos = geqo_randint(num_gene - 1, 0);
city_table[pos].select_list = (int) tour1[pos];
city_table[(int) tour1[pos]].used = 1; /* mark used */
}
--- 61,72 ----
}
/* determine the number of positions to be inherited from tour1 */
! num_positions = geqo_randint(root, 2 * num_gene / 3, num_gene / 3);
/* make a list of selected cities */
for (k = 0; k < num_positions; k++)
{
! pos = geqo_randint(root, num_gene - 1, 0);
city_table[pos].select_list = (int) tour1[pos];
city_table[(int) tour1[pos]].used = 1; /* mark used */
}
diff --git a/src/backend/optimizer/geqo/geqo_pmx.c b/src/backend/optimizer/geqo/geqo_pmx.c
index fe9d4b4..a8d18c5 100644
*** a/src/backend/optimizer/geqo/geqo_pmx.c
--- b/src/backend/optimizer/geqo/geqo_pmx.c
***************
*** 34,39 ****
--- 34,40 ----
/*************************************************************/
#include "postgres.h"
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_recombination.h"
***************
*** 43,49 ****
* partially matched crossover
*/
void
! pmx(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene)
{
int *failed = (int *) palloc((num_gene + 1) * sizeof(int));
int *from = (int *) palloc((num_gene + 1) * sizeof(int));
--- 44,50 ----
* partially matched crossover
*/
void
! pmx(PlannerInfo *root, Gene *tour1, Gene *tour2, Gene *offspring, int num_gene)
{
int *failed = (int *) palloc((num_gene + 1) * sizeof(int));
int *from = (int *) palloc((num_gene + 1) * sizeof(int));
*************** pmx(Gene *tour1, Gene *tour2, Gene *offs
*** 71,78 ****
}
/* locate crossover points */
! left = geqo_randint(num_gene - 1, 0);
! right = geqo_randint(num_gene - 1, 0);
if (left > right)
{
--- 72,79 ----
}
/* locate crossover points */
! left = geqo_randint(root, num_gene - 1, 0);
! right = geqo_randint(root, num_gene - 1, 0);
if (left > right)
{
diff --git a/src/backend/optimizer/geqo/geqo_pool.c b/src/backend/optimizer/geqo/geqo_pool.c
index 72eb34f..64e23ba 100644
*** a/src/backend/optimizer/geqo/geqo_pool.c
--- b/src/backend/optimizer/geqo/geqo_pool.c
*************** static int compare(const void *arg1, con
*** 39,45 ****
* allocates memory for GA pool
*/
Pool *
! alloc_pool(int pool_size, int string_length)
{
Pool *new_pool;
Chromosome *chromo;
--- 39,45 ----
* allocates memory for GA pool
*/
Pool *
! alloc_pool(PlannerInfo *root, int pool_size, int string_length)
{
Pool *new_pool;
Chromosome *chromo;
*************** alloc_pool(int pool_size, int string_len
*** 66,72 ****
* deallocates memory for GA pool
*/
void
! free_pool(Pool *pool)
{
Chromosome *chromo;
int i;
--- 66,72 ----
* deallocates memory for GA pool
*/
void
! free_pool(PlannerInfo *root, Pool *pool)
{
Chromosome *chromo;
int i;
*************** free_pool(Pool *pool)
*** 88,94 ****
* initialize genetic pool
*/
void
! random_init_pool(Pool *pool, GeqoEvalData *evaldata)
{
Chromosome *chromo = (Chromosome *) pool->data;
int i;
--- 88,94 ----
* initialize genetic pool
*/
void
! random_init_pool(PlannerInfo *root, Pool *pool)
{
Chromosome *chromo = (Chromosome *) pool->data;
int i;
*************** random_init_pool(Pool *pool, GeqoEvalDat
*** 105,114 ****
i = 0;
while (i < pool->size)
{
! init_tour(chromo[i].string, pool->string_length);
! pool->data[i].worth = geqo_eval(chromo[i].string,
! pool->string_length,
! evaldata);
if (pool->data[i].worth < DBL_MAX)
i++;
else
--- 105,113 ----
i = 0;
while (i < pool->size)
{
! init_tour(root, chromo[i].string, pool->string_length);
! pool->data[i].worth = geqo_eval(root, chromo[i].string,
! pool->string_length);
if (pool->data[i].worth < DBL_MAX)
i++;
else
*************** random_init_pool(Pool *pool, GeqoEvalDat
*** 133,139 ****
* maybe you have to change compare() for different ordering ...
*/
void
! sort_pool(Pool *pool)
{
qsort(pool->data, pool->size, sizeof(Chromosome), compare);
}
--- 132,138 ----
* maybe you have to change compare() for different ordering ...
*/
void
! sort_pool(PlannerInfo *root, Pool *pool)
{
qsort(pool->data, pool->size, sizeof(Chromosome), compare);
}
*************** compare(const void *arg1, const void *ar
*** 160,166 ****
* allocates a chromosome and string space
*/
Chromosome *
! alloc_chromo(int string_length)
{
Chromosome *chromo;
--- 159,165 ----
* allocates a chromosome and string space
*/
Chromosome *
! alloc_chromo(PlannerInfo *root, int string_length)
{
Chromosome *chromo;
*************** alloc_chromo(int string_length)
*** 174,180 ****
* deallocates a chromosome and string space
*/
void
! free_chromo(Chromosome *chromo)
{
pfree(chromo->string);
pfree(chromo);
--- 173,179 ----
* deallocates a chromosome and string space
*/
void
! free_chromo(PlannerInfo *root, Chromosome *chromo)
{
pfree(chromo->string);
pfree(chromo);
*************** free_chromo(Chromosome *chromo)
*** 185,191 ****
* assumes best->worst = smallest->largest
*/
void
! spread_chromo(Chromosome *chromo, Pool *pool)
{
int top,
mid,
--- 184,190 ----
* assumes best->worst = smallest->largest
*/
void
! spread_chromo(PlannerInfo *root, Chromosome *chromo, Pool *pool)
{
int top,
mid,
*************** spread_chromo(Chromosome *chromo, Pool *
*** 247,253 ****
* copy new gene into pool storage; always replace worst gene in pool
*/
! geqo_copy(&pool->data[pool->size - 1], chromo, pool->string_length);
swap_chromo.string = pool->data[pool->size - 1].string;
swap_chromo.worth = pool->data[pool->size - 1].worth;
--- 246,252 ----
* copy new gene into pool storage; always replace worst gene in pool
*/
! geqo_copy(root, &pool->data[pool->size - 1], chromo, pool->string_length);
swap_chromo.string = pool->data[pool->size - 1].string;
swap_chromo.worth = pool->data[pool->size - 1].worth;
diff --git a/src/backend/optimizer/geqo/geqo_px.c b/src/backend/optimizer/geqo/geqo_px.c
index 07f41cd..9e9cee2 100644
*** a/src/backend/optimizer/geqo/geqo_px.c
--- b/src/backend/optimizer/geqo/geqo_px.c
***************
*** 34,39 ****
--- 34,40 ----
/*************************************************************/
#include "postgres.h"
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_recombination.h"
***************
*** 43,49 ****
* position crossover
*/
void
! px(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table)
{
int num_positions;
--- 44,51 ----
* position crossover
*/
void
! px(PlannerInfo *root, Gene *tour1, Gene *tour2, Gene *offspring, int num_gene,
! City *city_table)
{
int num_positions;
*************** px(Gene *tour1, Gene *tour2, Gene *offsp
*** 57,68 ****
city_table[i].used = 0;
/* choose random positions that will be inherited directly from parent */
! num_positions = geqo_randint(2 * num_gene / 3, num_gene / 3);
/* choose random position */
for (i = 0; i < num_positions; i++)
{
! pos = geqo_randint(num_gene - 1, 0);
offspring[pos] = tour1[pos]; /* transfer cities to child */
city_table[(int) tour1[pos]].used = 1; /* mark city used */
--- 59,70 ----
city_table[i].used = 0;
/* choose random positions that will be inherited directly from parent */
! num_positions = geqo_randint(root, 2 * num_gene / 3, num_gene / 3);
/* choose random position */
for (i = 0; i < num_positions; i++)
{
! pos = geqo_randint(root, num_gene - 1, 0);
offspring[pos] = tour1[pos]; /* transfer cities to child */
city_table[(int) tour1[pos]].used = 1; /* mark city used */
diff --git a/src/backend/optimizer/geqo/geqo_random.c b/src/backend/optimizer/geqo/geqo_random.c
index ...59d5e42 .
*** a/src/backend/optimizer/geqo/geqo_random.c
--- b/src/backend/optimizer/geqo/geqo_random.c
***************
*** 0 ****
--- 1,42 ----
+ /*-------------------------------------------------------------------------
+ *
+ * geqo_random.c
+ *	   random number generator support for the genetic query optimizer
+ *
+ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * $PostgreSQL$
+ *
+ *-------------------------------------------------------------------------
+ */
+
+ #include "postgres.h"
+
+ #include "optimizer/geqo.h"
+ #include "optimizer/geqo_random.h"
+
+
+ static unsigned short global_random_state[3];
+
+ void
+ geqo_seed(PlannerInfo *root, double seed)
+ {
+ 	GeqoPrivateData *private = (GeqoPrivateData *) root->join_search_private;
+
+ 	private->random_state = palloc(sizeof(global_random_state));
+
+ 	/*
+ 	 * XXX. This seeding algorithm could certainly be improved - but it is
+ 	 * not critical to do so.
+ 	 */
+ 	memcpy(private->random_state,
+ 		   &seed,
+ 		   Min(sizeof(global_random_state), sizeof(seed)));
+ }
+
+ double
+ geqo_rand(PlannerInfo *root)
+ {
+ 	GeqoPrivateData *private = (GeqoPrivateData *) root->join_search_private;
+
+ 	if (private->random_state == NULL)
+ 		return erand48(global_random_state);
+ 	return erand48(private->random_state);
+ }
diff --git a/src/backend/optimizer/geqo/geqo_recombination.c b/src/backend/optimizer/geqo/geqo_recombination.c
index 3972fb1..e2b69a3 100644
*** a/src/backend/optimizer/geqo/geqo_recombination.c
--- b/src/backend/optimizer/geqo/geqo_recombination.c
***************
*** 20,25 ****
--- 20,26 ----
#include "postgres.h"
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_recombination.h"
***************
*** 35,41 ****
* and the procedure repeated.
*/
void
! init_tour(Gene *tour, int num_gene)
{
Gene *tmp;
int remainder;
--- 36,42 ----
* and the procedure repeated.
*/
void
! init_tour(PlannerInfo *root, Gene *tour, int num_gene)
{
Gene *tmp;
int remainder;
*************** init_tour(Gene *tour, int num_gene)
*** 53,59 ****
for (i = 0; i < num_gene; i++)
{
/* choose value between 0 and remainder inclusive */
! next = (int) geqo_randint(remainder, 0);
/* output that element of the tmp array */
tour[i] = tmp[next];
/* and delete it */
--- 54,60 ----
for (i = 0; i < num_gene; i++)
{
/* choose value between 0 and remainder inclusive */
! next = (int) geqo_randint(root, remainder, 0);
/* output that element of the tmp array */
tour[i] = tmp[next];
/* and delete it */
*************** init_tour(Gene *tour, int num_gene)
*** 81,87 ****
* allocate memory for city table
*/
City *
! alloc_city_table(int num_gene)
{
City *city_table;
--- 82,88 ----
* allocate memory for city table
*/
City *
! alloc_city_table(PlannerInfo *root, int num_gene)
{
City *city_table;
*************** alloc_city_table(int num_gene)
*** 99,105 ****
* deallocate memory of city table
*/
void
! free_city_table(City *city_table)
{
pfree(city_table);
}
--- 100,106 ----
* deallocate memory of city table
*/
void
! free_city_table(PlannerInfo *root, City *city_table)
{
pfree(city_table);
}
diff --git a/src/backend/optimizer/geqo/geqo_selection.c b/src/backend/optimizer/geqo/geqo_selection.c
index d4f2c4d..2cd7402 100644
*** a/src/backend/optimizer/geqo/geqo_selection.c
--- b/src/backend/optimizer/geqo/geqo_selection.c
***************
*** 42,48 ****
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_selection.h"
! static int linear(int max, double bias);
/*
--- 42,48 ----
#include "optimizer/geqo_random.h"
#include "optimizer/geqo_selection.h"
! static int linear(PlannerInfo *root, int max, double bias);
/*
*************** static int linear(int max, double bias);
*** 51,72 ****
* first and second genes are selected from the pool
*/
void
! geqo_selection(Chromosome *momma, Chromosome *daddy, Pool *pool, double bias)
{
int first,
second;
! first = linear(pool->size, bias);
! second = linear(pool->size, bias);
if (pool->size > 1)
{
while (first == second)
! second = linear(pool->size, bias);
}
! geqo_copy(momma, &pool->data[first], pool->string_length);
! geqo_copy(daddy, &pool->data[second], pool->string_length);
}
/*
--- 51,73 ----
* first and second genes are selected from the pool
*/
void
! geqo_selection(PlannerInfo *root, Chromosome *momma, Chromosome *daddy,
! Pool *pool, double bias)
{
int first,
second;
! first = linear(root, pool->size, bias);
! second = linear(root, pool->size, bias);
if (pool->size > 1)
{
while (first == second)
! second = linear(root, pool->size, bias);
}
! geqo_copy(root, momma, &pool->data[first], pool->string_length);
! geqo_copy(root, daddy, &pool->data[second], pool->string_length);
}
/*
*************** geqo_selection(Chromosome *momma, Chromo
*** 78,84 ****
* bias = (prob of first rule) / (prob of middle rule)
*/
static int
! linear(int pool_size, double bias) /* bias is y-intercept of linear
* distribution */
{
double index; /* index between 0 and pop_size */
--- 79,85 ----
* bias = (prob of first rule) / (prob of middle rule)
*/
static int
! linear(PlannerInfo *root, int pool_size, double bias) /* bias is y-intercept of linear
* distribution */
{
double index; /* index between 0 and pop_size */
*************** linear(int pool_size, double bias) /* b
*** 95,101 ****
{
double sqrtval;
! sqrtval = (bias * bias) - 4.0 * (bias - 1.0) * geqo_rand();
if (sqrtval > 0.0)
sqrtval = sqrt(sqrtval);
index = max * (bias - sqrtval) / 2.0 / (bias - 1.0);
--- 96,102 ----
{
double sqrtval;
! sqrtval = (bias * bias) - 4.0 * (bias - 1.0) * geqo_rand(root);
if (sqrtval > 0.0)
sqrtval = sqrt(sqrtval);
index = max * (bias - sqrtval) / 2.0 / (bias - 1.0);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index b375902..777d75d 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*************** standard_join_search(PlannerInfo *root,
*** 901,906 ****
--- 901,908 ----
int lev;
RelOptInfo *rel;
+ Assert(root->join_search_private == NULL);
+
/*
* We employ a simple "dynamic programming" algorithm: we first find all
* ways to build joins of two jointree items, then all ways to build joins
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 2430185..4126516 100644
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_real ConfigureNames
*** 2026,2031 ****
--- 2026,2040 ----
DEFAULT_GEQO_SELECTION_BIAS, MIN_GEQO_SELECTION_BIAS,
MAX_GEQO_SELECTION_BIAS, NULL, NULL
},
+ {
+ {"geqo_seed", PGC_USERSET, QUERY_TUNING_GEQO,
+ gettext_noop("GEQO: seed for random path selection for repeatable plan generation."),
+ gettext_noop("If zero, planning will not be repeatable."),
+ GUC_NOT_IN_SAMPLE
+ },
+ &Geqo_seed,
+ 0, 0, 1, NULL, NULL
+ },
{
{"bgwriter_lru_multiplier", PGC_SIGHUP, RESOURCES,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index ea48889..8f82d71 100644
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
*************** typedef struct PlannerInfo
*** 192,197 ****
--- 192,200 ----
/* These fields are used only when hasRecursion is true: */
int wt_param_id; /* PARAM_EXEC ID for the work table */
struct Plan *non_recursive_plan; /* plan for non-recursive term */
+
+ /* private data for join-order implementation, i.e. GEQO */
+ void *join_search_private;
} PlannerInfo;
diff --git a/src/include/optimizer/geqo.h b/src/include/optimizer/geqo.h
index ea4c812..7a9b6a4 100644
*** a/src/include/optimizer/geqo.h
--- b/src/include/optimizer/geqo.h
*************** extern int Geqo_pool_size; /* 2 .. inf,
*** 59,88 ****
extern int Geqo_generations; /* 1 .. inf, or 0 to use default */
extern double Geqo_selection_bias;
#define DEFAULT_GEQO_SELECTION_BIAS 2.0
#define MIN_GEQO_SELECTION_BIAS 1.5
#define MAX_GEQO_SELECTION_BIAS 2.0
- /*
- * Data structure to encapsulate information needed for building plan trees
- * (i.e., geqo_eval and gimme_tree).
- */
typedef struct
{
! PlannerInfo *root; /* the query we are planning */
! List *initial_rels; /* the base relations */
! } GeqoEvalData;
!
/* routines in geqo_main.c */
extern RelOptInfo *geqo(PlannerInfo *root,
int number_of_rels, List *initial_rels);
/* routines in geqo_eval.c */
! extern Cost geqo_eval(Gene *tour, int num_gene, GeqoEvalData *evaldata);
! extern RelOptInfo *gimme_tree(Gene *tour, int num_gene,
! GeqoEvalData *evaldata);
#endif /* GEQO_H */
--- 59,83 ----
extern int Geqo_generations; /* 1 .. inf, or 0 to use default */
extern double Geqo_selection_bias;
+ extern double Geqo_seed;
#define DEFAULT_GEQO_SELECTION_BIAS 2.0
#define MIN_GEQO_SELECTION_BIAS 1.5
#define MAX_GEQO_SELECTION_BIAS 2.0
typedef struct
{
! List *initial_rels;
! unsigned short *random_state;
! } GeqoPrivateData;
/* routines in geqo_main.c */
extern RelOptInfo *geqo(PlannerInfo *root,
int number_of_rels, List *initial_rels);
/* routines in geqo_eval.c */
! extern Cost geqo_eval(PlannerInfo *root, Gene *tour, int num_gene);
! extern RelOptInfo *gimme_tree(PlannerInfo *root, Gene *tour, int num_gene);
#endif /* GEQO_H */
diff --git a/src/include/optimizer/geqo_copy.h b/src/include/optimizer/geqo_copy.h
index ae13059..63b7c83 100644
*** a/src/include/optimizer/geqo_copy.h
--- b/src/include/optimizer/geqo_copy.h
***************
*** 22,29 ****
#ifndef GEQO_COPY_H
#define GEQO_COPY_H
#include "optimizer/geqo_gene.h"
! extern void geqo_copy(Chromosome *chromo1, Chromosome *chromo2, int string_length);
#endif /* GEQO_COPY_H */
--- 22,30 ----
#ifndef GEQO_COPY_H
#define GEQO_COPY_H
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_gene.h"
! extern void geqo_copy(PlannerInfo *root, Chromosome *chromo1, Chromosome *chromo2, int string_length);
#endif /* GEQO_COPY_H */
diff --git a/src/include/optimizer/geqo_mutation.h b/src/include/optimizer/geqo_mutation.h
index 0384de8..3ada430 100644
*** a/src/include/optimizer/geqo_mutation.h
--- b/src/include/optimizer/geqo_mutation.h
***************
*** 22,29 ****
#ifndef GEQO_MUTATION_H
#define GEQO_MUTATION_H
#include "optimizer/geqo_gene.h"
! extern void geqo_mutation(Gene *tour, int num_gene);
#endif /* GEQO_MUTATION_H */
--- 22,30 ----
#ifndef GEQO_MUTATION_H
#define GEQO_MUTATION_H
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_gene.h"
! extern void geqo_mutation(PlannerInfo *root, Gene *tour, int num_gene);
#endif /* GEQO_MUTATION_H */
diff --git a/src/include/optimizer/geqo_pool.h b/src/include/optimizer/geqo_pool.h
index 950e667..dbc6a01 100644
*** a/src/include/optimizer/geqo_pool.h
--- b/src/include/optimizer/geqo_pool.h
***************
*** 26,40 ****
#include "optimizer/geqo.h"
! extern Pool *alloc_pool(int pool_size, int string_length);
! extern void free_pool(Pool *pool);
! extern void random_init_pool(Pool *pool, GeqoEvalData *evaldata);
! extern Chromosome *alloc_chromo(int string_length);
! extern void free_chromo(Chromosome *chromo);
! extern void spread_chromo(Chromosome *chromo, Pool *pool);
! extern void sort_pool(Pool *pool);
#endif /* GEQO_POOL_H */
--- 26,40 ----
#include "optimizer/geqo.h"
! extern Pool *alloc_pool(PlannerInfo *root, int pool_size, int string_length);
! extern void free_pool(PlannerInfo *root, Pool *pool);
! extern void random_init_pool(PlannerInfo *root, Pool *pool);
! extern Chromosome *alloc_chromo(PlannerInfo *root, int string_length);
! extern void free_chromo(PlannerInfo *root, Chromosome *chromo);
! extern void spread_chromo(PlannerInfo *root, Chromosome *chromo, Pool *pool);
! extern void sort_pool(PlannerInfo *root, Pool *pool);
#endif /* GEQO_POOL_H */
diff --git a/src/include/optimizer/geqo_random.h b/src/include/optimizer/geqo_random.h
index fab0728..6fcfa7f 100644
*** a/src/include/optimizer/geqo_random.h
--- b/src/include/optimizer/geqo_random.h
***************
*** 27,38 ****
#include <math.h>
/* geqo_rand returns a random float value between 0 and 1 inclusive */
!
! #define geqo_rand() ((double) random() / (double) MAX_RANDOM_VALUE)
/* geqo_randint returns integer value between lower and upper inclusive */
!
! #define geqo_randint(upper,lower) \
! ( (int) floor( geqo_rand()*(((upper)-(lower))+0.999999) ) + (lower) )
#endif /* GEQO_RANDOM_H */
--- 27,37 ----
#include <math.h>
/* geqo_rand returns a random float value between 0 and 1 inclusive */
! extern double geqo_rand(PlannerInfo *root);
! extern void geqo_seed(PlannerInfo *root, double seed);
/* geqo_randint returns integer value between lower and upper inclusive */
! #define geqo_randint(root, upper,lower) \
! ( (int) floor( geqo_rand(root)*(((upper)-(lower))+0.999999) ) + (lower) )
#endif /* GEQO_RANDOM_H */
diff --git a/src/include/optimizer/geqo_recombination.h b/src/include/optimizer/geqo_recombination.h
index 2c4a338..2484259 100644
*** a/src/include/optimizer/geqo_recombination.h
--- b/src/include/optimizer/geqo_recombination.h
***************
*** 24,32 ****
#ifndef GEQO_RECOMBINATION_H
#define GEQO_RECOMBINATION_H
#include "optimizer/geqo_gene.h"
! extern void init_tour(Gene *tour, int num_gene);
/* edge recombination crossover [ERX] */
--- 24,33 ----
#ifndef GEQO_RECOMBINATION_H
#define GEQO_RECOMBINATION_H
+ #include "optimizer/geqo.h"
#include "optimizer/geqo_gene.h"
! extern void init_tour(PlannerInfo *root, Gene *tour, int num_gene);
/* edge recombination crossover [ERX] */
*************** typedef struct Edge
*** 38,49 ****
int unused_edges;
} Edge;
! extern Edge *alloc_edge_table(int num_gene);
! extern void free_edge_table(Edge *edge_table);
! extern float gimme_edge_table(Gene *tour1, Gene *tour2, int num_gene, Edge *edge_table);
! extern int gimme_tour(Edge *edge_table, Gene *new_gene, int num_gene);
/* partially matched crossover [PMX] */
--- 39,52 ----
int unused_edges;
} Edge;
! extern Edge *alloc_edge_table(PlannerInfo *root, int num_gene);
! extern void free_edge_table(PlannerInfo *root, Edge *edge_table);
! extern float gimme_edge_table(PlannerInfo *root, Gene *tour1, Gene *tour2,
! int num_gene, Edge *edge_table);
! extern int gimme_tour(PlannerInfo *root, Edge *edge_table, Gene *new_gene,
! int num_gene);
/* partially matched crossover [PMX] */
*************** extern int gimme_tour(Edge *edge_table,
*** 51,57 ****
#define DAD 1 /* indicator for gene from dad */
#define MOM 0 /* indicator for gene from mom */
! extern void pmx(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene);
typedef struct City
--- 54,62 ----
#define DAD 1 /* indicator for gene from dad */
#define MOM 0 /* indicator for gene from mom */
! extern void pmx(PlannerInfo *root,
! Gene *tour1, Gene *tour2,
! Gene *offspring, int num_gene);
typedef struct City
*************** typedef struct City
*** 62,80 ****
int select_list;
} City;
! extern City *alloc_city_table(int num_gene);
! extern void free_city_table(City *city_table);
/* cycle crossover [CX] */
! extern int cx(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table);
/* position crossover [PX] */
! extern void px(Gene *tour1, Gene *tour2, Gene *offspring, int num_gene, City *city_table);
/* order crossover [OX1] according to Davis */
! extern void ox1(Gene *mom, Gene *dad, Gene *offspring, int num_gene, City *city_table);
/* order crossover [OX2] according to Syswerda */
! extern void ox2(Gene *mom, Gene *dad, Gene *offspring, int num_gene, City *city_table);
#endif /* GEQO_RECOMBINATION_H */
--- 67,89 ----
int select_list;
} City;
! extern City *alloc_city_table(PlannerInfo *root, int num_gene);
! extern void free_city_table(PlannerInfo *root, City *city_table);
/* cycle crossover [CX] */
! extern int cx(PlannerInfo *root, Gene *tour1, Gene *tour2,
! Gene *offspring, int num_gene, City *city_table);
/* position crossover [PX] */
! extern void px(PlannerInfo *root, Gene *tour1, Gene *tour2, Gene *offspring,
! int num_gene, City *city_table);
/* order crossover [OX1] according to Davis */
! extern void ox1(PlannerInfo *root, Gene *mom, Gene *dad, Gene *offspring,
! int num_gene, City *city_table);
/* order crossover [OX2] according to Syswerda */
! extern void ox2(PlannerInfo *root, Gene *mom, Gene *dad, Gene *offspring,
! int num_gene, City *city_table);
#endif /* GEQO_RECOMBINATION_H */
diff --git a/src/include/optimizer/geqo_selection.h b/src/include/optimizer/geqo_selection.h
index 4b336d1..7e93cb4 100644
*** a/src/include/optimizer/geqo_selection.h
--- b/src/include/optimizer/geqo_selection.h
***************
*** 25,30 ****
#include "optimizer/geqo_gene.h"
! extern void geqo_selection(Chromosome *momma, Chromosome *daddy, Pool *pool, double bias);
#endif /* GEQO_SELECTION_H */
--- 25,32 ----
#include "optimizer/geqo_gene.h"
! extern void geqo_selection(PlannerInfo *root,
! Chromosome *momma, Chromosome *daddy,
! Pool *pool, double bias);
#endif /* GEQO_SELECTION_H */
--
1.6.3.3.335.ge09a8
On Thu, Jul 9, 2009 at 5:38 AM, Noah Misch <noah@leadboat.com> wrote:
Describing in those terms illuminates much. While the concepts do suggest 2^N
worst-case planning cost, my artificial test case showed a rigid 4^N pattern;
what could explain that?
Isn't that just because the planner has to examine O(2^N) subsets of
relations, and do O(2^N) work for each of them? To create a level-N join,
the planner chooses pairs of level-k and level-(N-k) joins. The count of
level-k joins is O(2^k), and the count of level-(N-k) ones is O(2^(N-k)).
Together that is O(N) * O(2^N) * O(2^k) * O(2^(N-k)), which is O(N * 4^N).
This is the worst case. If we could make a better estimate of the
required planning time (I believe the input data for a good heuristic
would be a matrix saying which relation is constrained to which other
relation), we could make better decisions about when to flatten
subqueries, collapse joins, or launch geqo...
Greetings
Marcin