PREPARE / EXECUTE

Started by Hans-Jürgen Schönig about 23 years ago · 14 messages
#1 Hans-Jürgen Schönig
postgres@cybertec.at

First of all, PREPARE/EXECUTE is a wonderful way to speed things up
significantly.
I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.
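
For readers who have not used the feature, here is a minimal sketch of
PREPARE/EXECUTE (table and column names are invented for illustration):

    -- parse, rewrite and plan the query once per session
    PREPARE orders_by_customer (integer) AS
        SELECT o.id, o.amount
        FROM orders o
        WHERE o.customer_id = $1;

    -- run the stored plan; parser and planner are skipped
    EXECUTE orders_by_customer (42);

The question is whether the plan built by PREPARE could outlive the
session.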

This might be useful when it comes to VERY complex queries (> 10 tables).
In many applications the situation is like this:

a. The user connects to the database.
b. The user sends various queries to the server (some might be repeated).
c. The user disconnects.

If there were a way to store execution plans in a table, the user could
load the plans of the most time-consuming queries into the backend
without parsing and optimizing them every time he authenticates.

Does it sound useful to anybody? Is it possible to do it or are there
some technical problems?

Maybe this is worth thinking about.

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at

#2 Greg Copeland
greg@CopelandConsulting.Net
In reply to: Hans-Jürgen Schönig (#1)
Re: PREPARE / EXECUTE

If you were using them that frequently, couldn't you just keep a
persistent connection? If it's not used that often, wouldn't the
overhead of preparing the query following a new connection become noise?

Greg


On Wed, 2002-10-23 at 09:24, Hans-Jürgen Schönig wrote:

First of all, PREPARE/EXECUTE is a wonderful way to speed things up
significantly.
I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.

This might be useful when it comes to VERY complex queries (> 10 tables).
In many applications the situation is like this:

a. The user connects to the database.
b. The user sends various queries to the server (some might be repeated).
c. The user disconnects.

If there were a way to store execution plans in a table, the user could
load the plans of the most time-consuming queries into the backend
without parsing and optimizing them every time he authenticates.

Does it sound useful to anybody? Is it possible to do it or are there
some technical problems?

Maybe this is worth thinking about.

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at


#3 Rod Taylor
rbt@rbt.ca
In reply to: Greg Copeland (#2)
Re: PREPARE / EXECUTE

On Wed, 2002-10-23 at 10:39, Greg Copeland wrote:

If you were using them that frequently, couldn't you just keep a
persistent connection? If it's not used that often, wouldn't the
overhead of preparing the query following a new connection become noise?

Especially by the time you add in the dependency tracking (drop a table,
and the stored query must go) and the modifications to ANALYZE needed to
clear out the stored list.

On Wed, 2002-10-23 at 09:24, Hans-Jürgen Schönig wrote:

First of all, PREPARE/EXECUTE is a wonderful way to speed things up
significantly.
I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.

This might be useful when it comes to VERY complex queries (> 10 tables).
In many applications the situation is like this:

a. The user connects to the database.
b. The user sends various queries to the server (some might be repeated).
c. The user disconnects.

If there were a way to store execution plans in a table, the user could
load the plans of the most time-consuming queries into the backend
without parsing and optimizing them every time he authenticates.

Does it sound useful to anybody? Is it possible to do it or are there
some technical problems?

Maybe this is worth thinking about.

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at


--
Rod Taylor

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hans-Jürgen Schönig (#1)
Re: PREPARE / EXECUTE

Hans-Jürgen Schönig <postgres@cybertec.at> writes:

I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.

The original version of the PREPARE patch used a shared-across-backends
cache for PREPAREd statements. We rejected that for a number of
reasons, one being the increased difficulty of keeping such a cache up
to date. I think actually storing the plans on disk would have all the
same problems, but worse.

regards, tom lane

#5 Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Hans-Jürgen Schönig (#1)
Re: PREPARE / EXECUTE

The idea is not to have it across multiple backends and keep it in sync
with the tables in the database. This is not the point.
My problem is that I have seen many performance-critical applications
sending just a few complex queries to the server. The problem arises when
you have many queries where the ratio "planner time"/"executor time" is
very high (e.g. complex joins with just one value as the result).
These applications stay the same for a long time (maybe even years), so
there is no need to worry about new tables and so forth - maybe there is
not even a need to worry about new data. In these cases we could speed up
the database significantly just by avoiding the use of the planner:

An example:
I have a join across 10 tables + 2 subselects across 4 tables
on the machine I use for testing:
planner: 12 seconds
executor: 1 second

The application will stay the same forever.
It could be 10 times faster if there were a way to load the execution
plan into the backend.
There is no way to use a persistent connection (many clients on
different machines, dynamic IPs, etc.).
There is no way to have an "invalid" execution plan because there are no
changes (new tables etc.) in the database.

Also: if people execute a prepared query and it fails, they will know
why - queries will fail if people drop a table even if those queries are
not prepared.
A new feature like the one we are discussing might be used rarely, but
if people use it they will benefit A LOT.

If we had a simple ASCII interface to load the stuff into the planner,
people could save MANY cycles.
When talking about tuning it is nice to gain 10% or even 20%, but in
many cases that does not solve the problem - if a problem can be reduced
by 90%, it is a REAL gain.
Gaining 10% can be done by tweaking the database a little - gaining
1000% cannot, so it might be worth thinking about even if the feature is
only used by 20% of the users out there. 20% of all Postgres users is
most likely more than 15,000 people.

Again: it is not supposed to be an everyday solution. It is a solution
for applications that stay the same for a very long time.
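
Purely to illustrate the idea, here is what such an ASCII interface
could look like - none of this syntax exists in PostgreSQL, and the
table names are invented:

    PREPARE big_report (integer) AS
        SELECT c.name, sum(o.amount)
        FROM customers c JOIN orders o ON o.customer_id = c.id
        WHERE o.year = $1
        GROUP BY c.name;

    -- hypothetical: serialize the prepared plan to ASCII on disk
    SAVE PLAN big_report TO 'big_report.plan';

    -- hypothetical, in a later session: load the stored plan and
    -- execute it without invoking the parser or the planner
    LOAD PLAN big_report FROM 'big_report.plan';
    EXECUTE big_report (2002);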

Hans

Tom Lane wrote:

Hans-Jürgen Schönig <postgres@cybertec.at> writes:

I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.

The original version of the PREPARE patch used a shared-across-backends
cache for PREPAREd statements. We rejected that for a number of
reasons, one being the increased difficulty of keeping such a cache up
to date. I think actually storing the plans on disk would have all the
same problems, but worse.

regards, tom lane


--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hans-Jürgen Schönig (#1)
Re: PREPARE / EXECUTE

Bruno Wolff III <bruno@wolff.to> writes:

Hans-Jürgen Schönig <postgres@cybertec.at> wrote:

I have a join across 10 tables + 2 subselects across 4 tables
on the machine I use for testing:
planner: 12 seconds
executor: 1 second

One option you have is to explicitly give the join order.

Yes, this is exactly the sort of situation where forcing the join order
is a big performance win. See
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/explicit-joins.html

regards, tom lane

#7 Bruno Wolff III
bruno@wolff.to
In reply to: Hans-Jürgen Schönig (#5)
Re: PREPARE / EXECUTE

On Wed, Oct 23, 2002 at 18:04:01 +0200,
Hans-Jürgen Schönig <postgres@cybertec.at> wrote:

An example:
I have a join across 10 tables + 2 subselects across 4 tables
on the machine I use for testing:
planner: 12 seconds
executor: 1 second

The application will stay the same forever.
It could be 10 times faster if there were a way to load the execution
plan into the backend.

One option you have is to explicitly give the join order. You can look at
EXPLAIN output to see what order the joins are done in and then rewrite
the SQL to force them to be done in that order. This should keep things
simple for the planner.
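
With explicit JOIN syntax the planner takes the join order as written
instead of searching all possible orders (see the explicit-joins
documentation Tom mentions). A minimal sketch, with invented table names:

    -- implicit joins: the planner searches the join-order space,
    -- which is what makes planning slow with many tables
    SELECT a.x, c.y
    FROM a, b, c
    WHERE b.a_id = a.id
      AND c.b_id = b.id;

    -- explicit JOIN syntax: the joins are performed in the order
    -- written, so the planner's search space collapses
    SELECT a.x, c.y
    FROM a
    JOIN b ON b.a_id = a.id
    JOIN c ON c.b_id = b.id;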

#8 Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Hans-Jürgen Schönig (#1)
Re: PREPARE / EXECUTE

This is exactly what we do in case of complex stuff. I know that it can
help to reduce the problem for the planner.
However, if you have explicit joins across 10 tables, the SQL statement
is not that readable any more, and it is still slower than a prepared
execution plan.

I guess it is worth thinking about prepared plans somewhere on disk.
Is there a way to transform ASCII -> plan?

Hans

Bruno Wolff III wrote:

On Wed, Oct 23, 2002 at 18:04:01 +0200,
Hans-Jürgen Schönig <postgres@cybertec.at> wrote:

An example:
I have a join across 10 tables + 2 subselects across 4 tables
on the machine I use for testing:
planner: 12 seconds
executor: 1 second

The application will stay the same forever.
It could be 10 times faster if there were a way to load the execution
plan into the backend.

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at

#9 Greg Copeland
greg@CopelandConsulting.Net
In reply to: Hans-Jürgen Schönig (#5)
Re: PREPARE / EXECUTE

Could you use some form of connection proxy where the proxy is actually
keeping persistent connections but your application is making transient
connections to the proxy? I believe this would result in the desired
performance boost and behavior.

Now, the next obvious question...anyone know of any proxy apps available
for postgresql?

Regards,

Greg


On Wed, 2002-10-23 at 11:04, Hans-Jürgen Schönig wrote:

The idea is not to have it across multiple backends and keep it in sync
with the tables in the database. This is not the point.
My problem is that I have seen many performance-critical applications
sending just a few complex queries to the server. The problem arises when
you have many queries where the ratio "planner time"/"executor time" is
very high (e.g. complex joins with just one value as the result).
These applications stay the same for a long time (maybe even years), so
there is no need to worry about new tables and so forth - maybe there is
not even a need to worry about new data. In these cases we could speed up
the database significantly just by avoiding the use of the planner:

An example:
I have a join across 10 tables + 2 subselects across 4 tables
on the machine I use for testing:
planner: 12 seconds
executor: 1 second

The application will stay the same forever.
It could be 10 times faster if there were a way to load the execution
plan into the backend.
There is no way to use a persistent connection (many clients on
different machines, dynamic IPs, etc.).
There is no way to have an "invalid" execution plan because there are no
changes (new tables etc.) in the database.

Also: if people execute a prepared query and it fails, they will know
why - queries will fail if people drop a table even if those queries are
not prepared.
A new feature like the one we are discussing might be used rarely, but
if people use it they will benefit A LOT.

If we had a simple ASCII interface to load the stuff into the planner,
people could save MANY cycles.
When talking about tuning it is nice to gain 10% or even 20%, but in
many cases that does not solve the problem - if a problem can be reduced
by 90%, it is a REAL gain.
Gaining 10% can be done by tweaking the database a little - gaining
1000% cannot, so it might be worth thinking about even if the feature is
only used by 20% of the users out there. 20% of all Postgres users is
most likely more than 15,000 people.

Again: it is not supposed to be an everyday solution. It is a solution
for applications that stay the same for a very long time.

Hans

Tom Lane wrote:

Hans-Jürgen Schönig <postgres@cybertec.at> writes:

I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.

The original version of the PREPARE patch used a shared-across-backends
cache for PREPAREd statements. We rejected that for a number of
reasons, one being the increased difficulty of keeping such a cache up
to date. I think actually storing the plans on disk would have all the
same problems, but worse.

regards, tom lane


--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at


#10 Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Hans-Jürgen Schönig (#1)
Re: PREPARE / EXECUTE

Greg Copeland wrote:

Could you use some form of connection proxy where the proxy is actually
keeping persistent connections but your application is making transient
connections to the proxy? I believe this would result in the desired
performance boost and behavior.

Now, the next obvious question...anyone know of any proxy apps available
for postgresql?

Regards,

Greg

There is one load-balancing package available which is based on the ACE
library.
Just have a look at:

http://freshmeat.net/projects/dbbalancer/

I haven't tested it yet.

I am now looking for a workaround - I am sure that there are many
workarounds for this issue (explicit joins, persistent connections,
etc.).
I thought it might be useful to have something like a data type (or
maybe a binary field) used to store execution plans.

People could use this feature as some sort of "server-side" function or
so ...
It can be seen as some sort of optimized function in the backend which
can be loaded/executed more efficiently.

Maybe others would like to see that feature as well.

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at

#11 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Hans-Jürgen Schönig (#10)
Re: PREPARE / EXECUTE

The idea is not to have it across multiple backends and keep it in sync
with the tables in the database. This is not the point.
My problem is that I have seen many performance-critical applications
sending just a few complex queries to the server. The problem arises when
you have many queries where the ratio "planner time"/"executor time" is
very high (e.g. complex joins with just one value as the result).

The standard approach to such a scenario would imho be to write stored
procedures for the complex queries (e.g. in PL/pgSQL) and use those from
the client.
Maybe even eliminate a few ping-pongs between client and server.
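
As a rough sketch of that approach (invented table and column names;
PL/pgSQL of this vintage takes the function body as a quoted string):

    CREATE FUNCTION yearly_total(integer) RETURNS numeric AS '
    DECLARE
        result numeric;
    BEGIN
        -- the complex query lives here; its plan is cached for the
        -- life of the backend after the first call
        SELECT sum(o.amount) INTO result
        FROM orders o, customers c
        WHERE o.customer_id = c.id
          AND o.year = $1;
        RETURN result;
    END;
    ' LANGUAGE 'plpgsql';

    SELECT yearly_total(2002);

Note that the cached plan still dies with the backend, so this helps
only for the lifetime of a connection.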

Andreas

#12 Hans-Jürgen Schönig
postgres@cybertec.at
In reply to: Zeugswetter Andreas SB SD (#11)
Re: PREPARE / EXECUTE

The standard approach to such a scenario would imho be to write stored
procedures for the complex queries (e.g. in PL/pgSQL) and use those from
the client.
Maybe even eliminate a few ping-pongs between client and server.

Andreas

Does it reduce the time taken by the planner?
Are server-side SQL functions optimized at runtime or at CREATE FUNCTION
time?

If the function is optimized at runtime, it is not a gain.

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at

#13 Neil Conway
neilc@samurai.com
In reply to: Zeugswetter Andreas SB SD (#11)
Re: PREPARE / EXECUTE

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

The standard approach to such a scenario would imho be to write
stored procedures for the complex queries (e.g. in PL/pgSQL) and use
those from the client. Maybe even eliminate a few ping-pongs between
client and server.

Since PL/PgSQL cached query plans are flushed when the backend exits,
how would this help?

Regarding the original suggestion of storing prepared plans on disk, I
agree with Tom -- it's basically the same idea as storing plans in
shared memory, which we previously considered (and Karel implemented),
but ultimately decided to remove. IMHO, the utility of this feature
doesn't justify the problems that would come with implementing it (see
the archives for the original implementation discussions).

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

#14 Karel Zak
zakkr@zf.jcu.cz
In reply to: Tom Lane (#4)
Re: PREPARE / EXECUTE

On Wed, Oct 23, 2002 at 11:02:14AM -0400, Tom Lane wrote:

Hans-Jürgen Schönig <postgres@cybertec.at> writes:

I wonder if there is a way to store a parsed/rewritten/planned query in
a table so that it can be loaded again.

The original version of the PREPARE patch used a shared-across-backends
cache for PREPAREd statements. We rejected that for a number of
reasons, one being the increased difficulty of keeping such a cache up
to date. I think actually storing the plans on disk would have all the
same problems, but worse.

Right.

There's a solution: a persistent backend (for example, like classic
Apache). That would solve the problem with the lifetime of all the
persistent caches. It's already in the TODO.

Karel

--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz