SDP query optimizer

Started by Adriano Langealmost 13 years ago7 messages
#1Adriano Lange
alange0001@gmail.com
5 attachment(s)

Hi all,

I have developed a new query optimizer for PostgreSQL and I would like
to share it with the community. The optimizer's name is Sampling and
Dynamic Programming (SDP). I put it into a plugin developed some years
ago, named LJQO:

https://github.com/alange0001/ljqo.git

This plugin was configured to compile only against PostgreSQL 9.2.
However, I guess it may be easily adjusted for other versions of PostgreSQL.

I would be glad for any feedback about SDP or even about LJQO.

I have some numbers about the SDP in comparison with GEQO. If
interested, see a diff between the two ".out2" files attached. The
schema and query are from a previous email posted by Andres Freund in
this list.

Thanks for attention.
Adriano Lange

Attachments:

query_2_geqo.out2text/plain; charset=UTF-8; name=query_2_geqo.out2Download
query_2_sdp_1.out2text/plain; charset=UTF-8; name=query_2_sdp_1.out2Download
query_2.sqltext/x-sql; name=query_2.sqlDownload
schema.sqltext/x-sql; name=schema.sqlDownload
views.sqltext/x-sql; name=views.sqlDownload
#2Josh Berkus
josh@agliodbs.com
In reply to: Adriano Lange (#1)
Re: SDP query optimizer

Adriano,

I have developed a new query optimizer for PostgreSQL and I would like
to share it with the community. The optimizer's name is Sampling and
Dynamic Programming (SDP). I put it into a plugin developed some years
ago, named LJQO:

Woah! Way cool.

As a warning, we're in the closing throes of version 9.3 right now, so
if you code/ideas doesn't get the attention it deserves, that's why.

There is an incomplete project from a few years back to make the
non-exhaustive query planner pluggable so that we could use different
algorithms. Unfortunately, it was never finished and merged with the
core code. Your planner is yet another reason it would be great to
complete this.

Keep up the good work!

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Ants Aasma
ants@cybertec.at
In reply to: Adriano Lange (#1)
Re: SDP query optimizer

On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001@gmail.com> wrote:

I have developed a new query optimizer for PostgreSQL and I would like to
share it with the community. The optimizer's name is Sampling and Dynamic
Programming (SDP). I put it into a plugin developed some years ago, named
LJQO:

https://github.com/alange0001/ljqo.git

Looks great. Do you happen to have any papers or insight into why SDP
works as well as it does? It isn't immediately clear to me why the
cheapest left-deep tree is a good heuristic start point for the
dynamic programming phase.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Adriano Lange
alange0001@gmail.com
In reply to: Josh Berkus (#2)
Re: SDP query optimizer

Hi,

On 22-03-2013 21:22, Josh Berkus wrote:

Woah! Way cool.

As a warning, we're in the closing throes of version 9.3 right now, so
if you code/ideas doesn't get the attention it deserves, that's why.

Ok. No problem. :-)

There is an incomplete project from a few years back to make the
non-exhaustive query planner pluggable so that we could use different
algorithms. Unfortunately, it was never finished and merged with the
core code. Your planner is yet another reason it would be great to
complete this.

Yes. I looked at the Julius and Tomas' project in pgFoundry [1]http://pgfoundry.org/projects/optimizer/ some
years ago, but it was inactive. Therefore, I decided to start a new one.

[1]: http://pgfoundry.org/projects/optimizer/

Anyway, good work for all of you.

--
Adriano Lange

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Adriano Lange
alange0001@gmail.com
In reply to: Ants Aasma (#3)
2 attachment(s)
Re: SDP query optimizer

On 22-03-2013 21:46, Ants Aasma wrote:

On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001@gmail.com> wrote:

I have developed a new query optimizer for PostgreSQL and I would like to
share it with the community. The optimizer's name is Sampling and Dynamic
Programming (SDP). I put it into a plugin developed some years ago, named
LJQO:

https://github.com/alange0001/ljqo.git

Looks great. Do you happen to have any papers or insight into why SDP
works as well as it does? It isn't immediately clear to me why the
cheapest left-deep tree is a good heuristic start point for the
dynamic programming phase.

Hi Ants Aasma,

I think it is difficult to say that SDP is completely better than GEQO.
There is a huge variety of situations where these kinds of optimizers
can be used.

I have made a series of experiments on SDP in a separate environment.
The main objective was to parallelize the optimizer. Thus, I called it
PSDP. I developed this environment from scratch in c++ and I used a
simplified version of the cost model used by PostgreSQL. Therefore, the
SDP is a sequential version of an experimental parallel optimizer.

For PSDP in its experimental environment, I have a comparison between
the costs obtained by the sampling phase and the costs obtained by the
complete PSDP. See the figure grafico-scaled_costs-dp_psdp attached. I
think that the dynamic programming phase can bring some robustness to
the optimizer. However, I still need to perform these same experiments
on PostgreSQL.

Considering the heuristic used in sampling phase, I have made another
series of experiments on PostgreSQL 9.0 comparing this sampling
(Sampling-L) with both a sampling on bushy trees (Sampling-B) and the
sampling used by GEQO. See as example the file "sampling...eps" also
attached. By these experiments I observed that Sampling-L was
statistically more efficient to get good join orders in a short time.

Regards,
Adriano Lange

Attachments:

grafico-scaled_costs-dp_psdp.epsimage/x-eps; name=grafico-scaled_costs-dp_psdp.epsDownload
sampling:chain,100,12,14,2:freq.epsimage/x-eps; name="sampling:chain,100,12,14,2:freq.eps"Download
#6Andres Freund
andres@2ndquadrant.com
In reply to: Adriano Lange (#1)
Re: SDP query optimizer

On 2013-03-22 20:35:43 -0300, Adriano Lange wrote:

Hi all,

I have developed a new query optimizer for PostgreSQL and I would like to
share it with the community. The optimizer's name is Sampling and Dynamic
Programming (SDP). I put it into a plugin developed some years ago, named
LJQO:

https://github.com/alange0001/ljqo.git

This plugin was configured to compile only against PostgreSQL 9.2. However,
I guess it may be easily adjusted for other versions of PostgreSQL.

I would be glad for any feedback about SDP or even about LJQO.

I have some numbers about the SDP in comparison with GEQO. If interested,
see a diff between the two ".out2" files attached. The schema and query are
from a previous email posted by Andres Freund in this list.

I just want to mention that unless you skew the statistics for the individual
tables from their empty/default state this mostly measures a pretty degenerate
case where optima are very rare and not very differentiated. Thats a useful
thing to test, but not to have as the target to optimize for.
So it might be interesting to run that thing with some table
stats/contents stats set up.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Adriano Lange
alange0001@gmail.com
In reply to: Andres Freund (#6)
Re: SDP query optimizer

On 23-03-2013 10:15, Andres Freund wrote:

I just want to mention that unless you skew the statistics for the individual
tables from their empty/default state this mostly measures a pretty degenerate
case where optima are very rare and not very differentiated. Thats a useful
thing to test, but not to have as the target to optimize for.
So it might be interesting to run that thing with some table
stats/contents stats set up.

Yes, the search space obtained from this experiment may be very simpler
than a real case. Beyond this experiment, I can construct classical
queries used to evaluate this kind of algorithm, as stars, cliques,
chains and cycles. Beyond these queries I have no idea how can I further
test it.

Regards,

Adriano Lange

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers