Query runs in 335ms; function in 100,239ms : date problem?

Started by Rory Campbell-Langeover 14 years ago7 messagesgeneral
Jump to latest
#1Rory Campbell-Lange
rory@campbell-lange.net

I have a function wrapping a (fairly complex) query.

The query itself runs in about a 1/3rd of a second. When running the
query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
over 100 seconds, about 300 times slower.

The function takes 3 input parameters: 2 dates and a boolean. The dates
(in_date_from, in_date_to) are used several times in the function.

When I replace the two parameters in the body of the query with, for
instance date'2011-05-01' and date'2011-08-01', the function operates
almost as speedily as the straight query.

I would be grateful to know how to work around this date problem.

As the function is rather large I have taken the liberty of posting it
here:
http://campbell-lange.net/media/files/fn_report_pers_leave.sql.html

Rory

#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Rory Campbell-Lange (#1)
Re: Query runs in 335ms; function in 100,239ms : date problem?

On 5 Září 2011, 23:07, Rory Campbell-Lange wrote:

I have a function wrapping a (fairly complex) query.

The query itself runs in about a 1/3rd of a second. When running the
query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
over 100 seconds, about 300 times slower.

The function takes 3 input parameters: 2 dates and a boolean. The dates
(in_date_from, in_date_to) are used several times in the function.

When I replace the two parameters in the body of the query with, for
instance date'2011-05-01' and date'2011-08-01', the function operates
almost as speedily as the straight query.

I would be grateful to know how to work around this date problem.

As the function is rather large I have taken the liberty of posting it
here:
http://campbell-lange.net/media/files/fn_report_pers_leave.sql.html

Do I understand correctly that you compare a query with literal parameters
with a parametrized query wrapped in a plpgsql function?

Try to run it as a prepared query - I guess you'll get about the same run
time as with the function (i.e. over 100 seconds).

The problem with prepared statements is that when planning the query, the
parameter values are unknown - so the optimizer does not know selectivity
of the conditions etc. and uses "common" values to prepare a safe plan.
OTOH the literal parameters allow to optimize the plan according to the
actual parameter values.

Tomas

#3Rory Campbell-Lange
rory@campbell-lange.net
In reply to: Tomas Vondra (#2)
Re: Query runs in 335ms; function in 100,239ms : date problem?

On 05/09/11, Tomas Vondra (tv@fuzzy.cz) wrote:

On 5 Z�??� 2011, 23:07, Rory Campbell-Lange wrote:

...

The query itself runs in about a 1/3rd of a second. When running the
query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
over 100 seconds, about 300 times slower.

...

As the function is rather large I have taken the liberty of posting it
here:
http://campbell-lange.net/media/files/fn_report_pers_leave.sql.html

Do I understand correctly that you compare a query with literal parameters
with a parametrized query wrapped in a plpgsql function?

Yes! Certainly I need to make the function perform more quickly.

Try to run it as a prepared query - I guess you'll get about the same run
time as with the function (i.e. over 100 seconds).

The prepared query runs in almost exactly the same time as the function,
but thanks for the suggestion. A very useful aspect of it is that I was
able to get the EXPLAIN output which I guess gives a fairly good picture
of the plan used for the function.

The explain output is here:
http://campbell-lange.net/media/files/explain.txt.html

I'm inexperienced in reading EXPLAIN output, but it looks like the
Nested Loop Semi Join at line 72 is running very slowly.

The problem with prepared statements is that when planning the query, the
parameter values are unknown - so the optimizer does not know selectivity
of the conditions etc. and uses "common" values to prepare a safe plan.
OTOH the literal parameters allow to optimize the plan according to the
actual parameter values.

Thank you very much for the helpful explanation.

Regards
Rory

#4Rory Campbell-Lange
rory@campbell-lange.net
In reply to: Rory Campbell-Lange (#3)
Re: Query runs in 335ms; function in 100,239ms : date problem?

On 05/09/11, Rory Campbell-Lange (rory@campbell-lange.net) wrote:

On 05/09/11, Tomas Vondra (tv@fuzzy.cz) wrote:

On 5 Z�??� 2011, 23:07, Rory Campbell-Lange wrote:

...

The query itself runs in about a 1/3rd of a second. When running the
query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
over 100 seconds, about 300 times slower.

...

Try to run it as a prepared query - I guess you'll get about the same run
time as with the function (i.e. over 100 seconds).

The prepared query runs in almost exactly the same time as the function,
but thanks for the suggestion. A very useful aspect of it is that I was
able to get the EXPLAIN output which I guess gives a fairly good picture
of the plan used for the function.

The explain output is here:
http://campbell-lange.net/media/files/explain.txt.html

I'm inexperienced in reading EXPLAIN output, but it looks like the
Nested Loop Semi Join at line 72 is running very slowly.

I added in more filtering conditions to the clause at line 72 and the
prepared statement dropped in runtime to 24043.902 ms. Unfortunately the
function ran slower -- 47957.796 -- but even that is a 50% improvement.

Thanks very much for your help.

Regards
Rory

#5Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Rory Campbell-Lange (#3)
Re: Query runs in 335ms; function in 100,239ms : date problem?

On 6 Září 2011, 0:27, Rory Campbell-Lange wrote:

On 05/09/11, Tomas Vondra (tv@fuzzy.cz) wrote:

Do I understand correctly that you compare a query with literal
parameters
with a parametrized query wrapped in a plpgsql function?

Yes! Certainly I need to make the function perform more quickly.

Try to run it as a prepared query - I guess you'll get about the same
run
time as with the function (i.e. over 100 seconds).

The prepared query runs in almost exactly the same time as the function,
but thanks for the suggestion. A very useful aspect of it is that I was
able to get the EXPLAIN output which I guess gives a fairly good picture
of the plan used for the function.

Well, my point was that the queries wrapped in functions are executed just
like prepared statements. And because prepared queries do not use
parameter values to optimize the plan, the result may be worse compared to
queries with literal parameters. So I was not expecting an improvement, I
was merely trying to show the problem.

The explain output is here:
http://campbell-lange.net/media/files/explain.txt.html

I'm inexperienced in reading EXPLAIN output, but it looks like the
Nested Loop Semi Join at line 72 is running very slowly.

I've posted the plan here: http://explain.depesz.com/s/uYX

Yes, the nested loop is the problem. One of the problems is that the join
condition - can you rewrite this

AND r.d_date || '-' || r.n_session || '-' || u.n_id IN
(SELECT
d_date || '-' || n_session || '-' || n_person
FROM
leave_association
WHERE
d_date >= in_date_from
AND d_date <= in_date_to
) -- i.e. leave where the person normally works

like this

AND EXISTS
(SELECT
1
FROM
leave_association
WHERE
d_date >= in_date_from
AND d_date <= in_date_to
AND d_date = r.d_date
AND n_session = r.n_session
AND n_person = u.n_id
) -- i.e. leave where the person normally works

and then do the same for the NOT IN subquery (=> NOT EXISTS). I think it
should return the same results, plus it might use indexes on the
leave_association. That was not possible because of the join condition.

Tomas

#6Torsten Zühlsdorff
foo@meisterderspiele.de
In reply to: Rory Campbell-Lange (#4)
Re: Query runs in 335ms; function in 100,239ms : date problem?

Rory Campbell-Lange schrieb:

Try to run it as a prepared query - I guess you'll get about the same run
time as with the function (i.e. over 100 seconds).

The prepared query runs in almost exactly the same time as the function,
but thanks for the suggestion. A very useful aspect of it is that I was
able to get the EXPLAIN output which I guess gives a fairly good picture
of the plan used for the function.

The explain output is here:
http://campbell-lange.net/media/files/explain.txt.html

I'm inexperienced in reading EXPLAIN output, but it looks like the
Nested Loop Semi Join at line 72 is running very slowly.

I added in more filtering conditions to the clause at line 72 and the
prepared statement dropped in runtime to 24043.902 ms. Unfortunately the
function ran slower -- 47957.796 -- but even that is a 50% improvement.

Also declare your function as STABLE. This can trigger an speed-increase.

Greetings,
Torsten
--
http://www.dddbl.de - ein Datenbank-Layer, der die Arbeit mit 8
verschiedenen Datenbanksystemen abstrahiert,
Queries von Applikationen trennt und automatisch die Query-Ergebnisse
auswerten kann.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#5)
Re: Query runs in 335ms; function in 100,239ms : date problem?

"Tomas Vondra" <tv@fuzzy.cz> writes:

On 6 Září 2011, 0:27, Rory Campbell-Lange wrote:

The prepared query runs in almost exactly the same time as the function,
but thanks for the suggestion. A very useful aspect of it is that I was
able to get the EXPLAIN output which I guess gives a fairly good picture
of the plan used for the function.

Well, my point was that the queries wrapped in functions are executed just
like prepared statements. And because prepared queries do not use
parameter values to optimize the plan, the result may be worse compared to
queries with literal parameters. So I was not expecting an improvement, I
was merely trying to show the problem.

Right. The actual fix is to use EXECUTE so you force a new plan to be
generated each time. If you use EXECUTE USING to insert the parameter
values, you can avoid most of the notational mess this would otherwise
imply, as well as the risk of SQL-injection bugs from failing to quote
parameter values safely.

I'm hoping that 9.2 will be smart enough to not need such workarounds,
but that's where things stand at the moment.

regards, tom lane