GSoC idea - Simulated annealing to search for query plans

Started by Grzegorz Parkaalmost 11 years ago11 messages
#1Grzegorz Parka
grzegorz.parka@gmail.com

Dear Hackers,

I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
Computer Science at WUT, Poland. Last year I've been a bit into
evolutionary algorithms and during my research I found out about GEQO in
Postgres. I also found out that there are plans to try a different attempt
to find optimal query plans and thought it could be a good thing to use it
as a project for GSoC.

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'. I believe this
would be potentially beneficial to Postgres to check if such a heuristic
could really choose better query plans than GEQO does. Judging by the
results that simulated annealing gives on the travelling salesman problem,
it looks like a simpler and potentially more effective way of combinatorial
optimization.

As deliverables of such a project I would provide a simple implementation
of basic simulated annealing optimizer and some form of quantitative
comparison with GEQO.

I see that this may be considerably bigger than most of GSoC projects, but
I would like to know your opinion. Do you think that this would be
beneficial enough to make a proper GSoC project? I would also like to know
if you have any additional ideas about this project.

Best regards,
Grzegorz Parka

#2Josh Berkus
josh@agliodbs.com
In reply to: Grzegorz Parka (#1)
Re: GSoC idea - Simulated annealing to search for query plans

On 02/26/2015 01:59 PM, Grzegorz Parka wrote:

Dear Hackers,

I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
Computer Science at WUT, Poland. Last year I've been a bit into
evolutionary algorithms and during my research I found out about GEQO in
Postgres. I also found out that there are plans to try a different
attempt to find optimal query plans and thought it could be a good thing
to use it as a project for GSoC.

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'. I believe
this would be potentially beneficial to Postgres to check if such a
heuristic could really choose better query plans than GEQO does. Judging
by the results that simulated annealing gives on the travelling salesman
problem, it looks like a simpler and potentially more effective way of
combinatorial optimization.

As deliverables of such a project I would provide a simple
implementation of basic simulated annealing optimizer and some form of
quantitative comparison with GEQO.

You might look at the earlier attempt to make the GEQO replacement
"pluggable". That project failed to complete sufficiently to be a
feature though, but it did enough to show that our current GEQO
implementation was suboptimal.

I'm currently searching for this project ... it was a GSOC project, but
I think before they required posting to Google Code.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#2)
Re: GSoC idea - Simulated annealing to search for query plans

Josh Berkus <josh@agliodbs.com> writes:

On 02/26/2015 01:59 PM, Grzegorz Parka wrote:

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'.

You might look at the earlier attempt to make the GEQO replacement
"pluggable". That project failed to complete sufficiently to be a
feature though, but it did enough to show that our current GEQO
implementation was suboptimal.

I'm currently searching for this project ... it was a GSOC project, but
I think before they required posting to Google Code.

I seem to recall somebody demo'ing a simulated-annealing GEQO replacement
at PGCon a couple years back. It never got to the point of being a
submitted patch though.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: GSoC idea - Simulated annealing to search for query plans

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

On 02/26/2015 01:59 PM, Grzegorz Parka wrote:

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'.

You might look at the earlier attempt to make the GEQO replacement
"pluggable". That project failed to complete sufficiently to be a
feature though, but it did enough to show that our current GEQO
implementation was suboptimal.

I'm currently searching for this project ... it was a GSOC project, but
I think before they required posting to Google Code.

I seem to recall somebody demo'ing a simulated-annealing GEQO replacement
at PGCon a couple years back. It never got to the point of being a
submitted patch though.

Yea, it was Jan Urbański (CCed).

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Fabrízio de Royes Mello
fabriziomello@gmail.com
In reply to: Andres Freund (#4)
Re: GSoC idea - Simulated annealing to search for query plans

On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

On 02/26/2015 01:59 PM, Grzegorz Parka wrote:

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'.

You might look at the earlier attempt to make the GEQO replacement
"pluggable". That project failed to complete sufficiently to be a
feature though, but it did enough to show that our current GEQO
implementation was suboptimal.

I'm currently searching for this project ... it was a GSOC project,

but

I think before they required posting to Google Code.

I seem to recall somebody demo'ing a simulated-annealing GEQO

replacement

at PGCon a couple years back. It never got to the point of being a
submitted patch though.

Yea, it was Jan Urbański (CCed).

And the project link: https://github.com/wulczer/saio

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

#6Josh Berkus
josh@agliodbs.com
In reply to: Grzegorz Parka (#1)
Re: GSoC idea - Simulated annealing to search for query plans

On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:

On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
<mailto:andres@2ndquadrant.com>> wrote:

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com <mailto:josh@agliodbs.com>> writes:

On 02/26/2015 01:59 PM, Grzegorz Parka wrote:

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'.

You might look at the earlier attempt to make the GEQO replacement
"pluggable". That project failed to complete sufficiently to be a
feature though, but it did enough to show that our current GEQO
implementation was suboptimal.

I'm currently searching for this project ... it was a GSOC

project, but

I think before they required posting to Google Code.

I seem to recall somebody demo'ing a simulated-annealing GEQO

replacement

at PGCon a couple years back. It never got to the point of being a
submitted patch though.

Yea, it was Jan Urbański (CCed).

And the project link: https://github.com/wulczer/saio

So what w'ere saying, Grzegorz, is that we would love to see someone
pick this up and get it to the point of making it a feature as a GSOC
project. I think if you can start from where Jan left off, you could
actually complete it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7ktm@rice.edu
ktm@rice.edu
In reply to: Grzegorz Parka (#1)
Re: GSoC idea - Simulated annealing to search for query plans

On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote:

Dear Hackers,

I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
Computer Science at WUT, Poland. Last year I've been a bit into
evolutionary algorithms and during my research I found out about GEQO in
Postgres. I also found out that there are plans to try a different attempt
to find optimal query plans and thought it could be a good thing to use it
as a project for GSoC.

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'. I believe this
would be potentially beneficial to Postgres to check if such a heuristic
could really choose better query plans than GEQO does. Judging by the
results that simulated annealing gives on the travelling salesman problem,
it looks like a simpler and potentially more effective way of combinatorial
optimization.

As deliverables of such a project I would provide a simple implementation
of basic simulated annealing optimizer and some form of quantitative
comparison with GEQO.

I see that this may be considerably bigger than most of GSoC projects, but
I would like to know your opinion. Do you think that this would be
beneficial enough to make a proper GSoC project? I would also like to know
if you have any additional ideas about this project.

Best regards,
Grzegorz Parka

Hi,

I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.

Regards,
Ken

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Jan Urbański
wulczer@wulczer.org
In reply to: Josh Berkus (#6)
Re: GSoC idea - Simulated annealing to search for query plans

Josh Berkus writes:

On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:

On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
<mailto:andres@2ndquadrant.com>> wrote:

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:

I seem to recall somebody demo'ing a simulated-annealing GEQO

replacement

at PGCon a couple years back. It never got to the point of being a
submitted patch though.

Yea, it was Jan Urbański (CCed).

And the project link: https://github.com/wulczer/saio

So what w'ere saying, Grzegorz, is that we would love to see someone
pick this up and get it to the point of making it a feature as a GSOC
project. I think if you can start from where Jan left off, you could
actually complete it.

Sorry, late to the party.

Yes, I wrote a GEQO replacement that used simulated annealing for my Master
thesis. It got to a point where it was generating plans similar to what GEQO
outputs for small queries and better plans for very large ones.

The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub
repo contains source for the PGCon presentation, which gives a higher-level
overview.

The big problem turned out to be writing the step function that generates a new
join order from a previous one. Look for the "Simulated Annealing challenges"
and "Moves generator" chapters in my thesis, which are the only interesting
ones :)

If you'd like to pick up where I left, I'd be more than happy to help in any
ways I can.

Best,
Jan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Grzegorz Parka
grzegorz.parka@gmail.com
In reply to: Jan Urbański (#8)
Re: GSoC idea - Simulated annealing to search for query plans

Thank you a lot for your feedback. I searched a lot about GEQO,
but I didn't find information about any earlier attempts.
I'm happy to know that this is important for Postgres.

I'm really interested in this project, so I just need to estimate if I can
handle it.
Now I will spend some time with SAIO and GEQO to find it out.

Best,
Grzegorz

2015-02-27 16:29 GMT+01:00 Jan Urbański <wulczer@wulczer.org>:

Show quoted text

Josh Berkus writes:

On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:

On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
<mailto:andres@2ndquadrant.com>> wrote:

On 2015-02-26 20:23:33 -0500, Tom Lane wrote:

I seem to recall somebody demo'ing a simulated-annealing GEQO

replacement

at PGCon a couple years back. It never got to the point of being a
submitted patch though.

Yea, it was Jan Urbański (CCed).

And the project link: https://github.com/wulczer/saio

So what w'ere saying, Grzegorz, is that we would love to see someone
pick this up and get it to the point of making it a feature as a GSOC
project. I think if you can start from where Jan left off, you could
actually complete it.

Sorry, late to the party.

Yes, I wrote a GEQO replacement that used simulated annealing for my Master
thesis. It got to a point where it was generating plans similar to what
GEQO
outputs for small queries and better plans for very large ones.

The thesis itself is here: https://wulczer.org/saio.pdf and the linked
GitHub
repo contains source for the PGCon presentation, which gives a higher-level
overview.

The big problem turned out to be writing the step function that generates
a new
join order from a previous one. Look for the "Simulated Annealing
challenges"
and "Moves generator" chapters in my thesis, which are the only interesting
ones :)

If you'd like to pick up where I left, I'd be more than happy to help in
any
ways I can.

Best,
Jan

#10Grzegorz Parka
grzegorz.parka@gmail.com
In reply to: ktm@rice.edu (#7)
Re: GSoC idea - Simulated annealing to search for query plans

I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.

Does it mean just to make the join order optimizer pluggable? If yes then
it is already pluggable as an extension. Is this the desired approach so
far?
I'm thinking of testing and improving SAIO as an extension before reaching
a satisfactory quality of code and returned plans.

Would then the destination be the /contrib and then main source tree or
would it ever stay as an extension?

Best regards,
Grzegorz

2015-02-27 15:03 GMT+01:00 ktm@rice.edu <ktm@rice.edu>:

Show quoted text

On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote:

Dear Hackers,

I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
Computer Science at WUT, Poland. Last year I've been a bit into
evolutionary algorithms and during my research I found out about GEQO in
Postgres. I also found out that there are plans to try a different

attempt

to find optimal query plans and thought it could be a good thing to use

it

as a project for GSoC.

I'm interested in one of old TODO items related to the optimizer -
'Consider compressed annealing to search for query plans'. I believe this
would be potentially beneficial to Postgres to check if such a heuristic
could really choose better query plans than GEQO does. Judging by the
results that simulated annealing gives on the travelling salesman

problem,

it looks like a simpler and potentially more effective way of

combinatorial

optimization.

As deliverables of such a project I would provide a simple implementation
of basic simulated annealing optimizer and some form of quantitative
comparison with GEQO.

I see that this may be considerably bigger than most of GSoC projects,

but

I would like to know your opinion. Do you think that this would be
beneficial enough to make a proper GSoC project? I would also like to

know

if you have any additional ideas about this project.

Best regards,
Grzegorz Parka

Hi,

I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.

Regards,
Ken

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Grzegorz Parka (#10)
Re: GSoC idea - Simulated annealing to search for query plans

Grzegorz Parka <grzegorz.parka@gmail.com> writes:

I'm thinking of testing and improving SAIO as an extension before reaching
a satisfactory quality of code and returned plans.

Would then the destination be the /contrib and then main source tree or
would it ever stay as an extension?

I'd like to push it into the main tree, replacing GEQO, if as we suspect
it turns out to dominate GEQO on speed/plan quality. There's no reason
not to test it in the manner you describe though.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers