Function returning 2 columns evaluated twice when both columns are needed

Started by Gerhard Wiesingerover 16 years ago11 messagesgeneral
Jump to latest
#1Gerhard Wiesinger
lists@wiesinger.com

Hello,

I'm having a problem with the following:
CREATE TYPE Sums AS (sum_m1 double precision, sum_m2 double precision);
CREATE TYPE date_m1_m2 AS (cur_date date, sum_m1 double precision, sum_m2 double precision);

CREATE OR REPLACE FUNCTION getSums(IN start_ts timestamp with time
zone, IN stop_ts timestamp with time zone) RETURNS Sums AS $$
...

CREATE OR REPLACE FUNCTION getsumInterval(date, date) RETURNS SETOF date_m1_m2 AS $$
SELECT
cur_date,
(getSums(start_ts, stop_ts)).* -- No optimal since function is evaluated 2 times => 24s
-- getSums(start_ts, stop_ts) -- in one column and not usable as I need 2 columns, but takes only 12s
FROM
getDatesTimestamps($1, $2)
;
$$ LANGUAGE SQL;

Since getSums() is a cursor and is complex and takes long time getSums
should only be evaluated once. Is there a better solution available to
get both columns from the function in the select?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gerhard Wiesinger (#1)
Re: Function returning 2 columns evaluated twice when both columns are needed

Gerhard Wiesinger <lists@wiesinger.com> writes:

Since getSums() is a cursor and is complex and takes long time getSums
should only be evaluated once. Is there a better solution available to
get both columns from the function in the select?

You need a sub-select, along the lines of

SELECT
cur_date,
(gs).sum_m1,
(gs).sum_m2
FROM
(
SELECT
cur_date,
getSums(start_ts, stop_ts) AS gs
FROM
getDatesTimestamps($1, $2)
OFFSET 0
) AS ss
;

The OFFSET bit is a kluge, but is needed to keep the planner from
flattening the subquery and undoing your work.

regards, tom lane

#3Gerhard Wiesinger
lists@wiesinger.com
In reply to: Tom Lane (#2)
Re: Function returning 2 columns evaluated twice when both columns are needed

On Sun, 18 Oct 2009, Tom Lane wrote:

Gerhard Wiesinger <lists@wiesinger.com> writes:

Since getSums() is a cursor and is complex and takes long time getSums
should only be evaluated once. Is there a better solution available to
get both columns from the function in the select?

You need a sub-select, along the lines of

SELECT
cur_date,
(gs).sum_m1,
(gs).sum_m2
FROM
(
SELECT
cur_date,
getSums(start_ts, stop_ts) AS gs
FROM
getDatesTimestamps($1, $2)
OFFSET 0
) AS ss
;

The OFFSET bit is a kluge, but is needed to keep the planner from
flattening the subquery and undoing your work.

Thnx Tom. It also works without the OFFSET kludge. Any ideas why?
Looks also reasonable to me because there is no SETOF returned.

BTW: Why is the function in the original statement evaluated twice? On
"SELECT table.*" I guess query is also executed once and not n times (for
each column).

PG is version 8.3.8.

Thnx.

Ciao,
Gerhard

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gerhard Wiesinger (#3)
Re: Function returning 2 columns evaluated twice when both columns are needed

Gerhard Wiesinger <lists@wiesinger.com> writes:

On Sun, 18 Oct 2009, Tom Lane wrote:

The OFFSET bit is a kluge, but is needed to keep the planner from
flattening the subquery and undoing your work.

Thnx Tom. It also works without the OFFSET kludge. Any ideas why?

Probably because you have the function declared VOLATILE.

regards, tom lane

#5Gerhard Wiesinger
lists@wiesinger.com
In reply to: Tom Lane (#4)
Re: Function returning 2 columns evaluated twice when both columns are needed

On Mon, 19 Oct 2009, Tom Lane wrote:

Gerhard Wiesinger <lists@wiesinger.com> writes:

On Sun, 18 Oct 2009, Tom Lane wrote:

The OFFSET bit is a kluge, but is needed to keep the planner from
flattening the subquery and undoing your work.

Thnx Tom. It also works without the OFFSET kludge. Any ideas why?

Probably because you have the function declared VOLATILE.

None of the function is declared VOLATILE. Any other idea?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

#6Christophe Pettus
xof@thebuild.com
In reply to: Gerhard Wiesinger (#5)
Re: Function returning 2 columns evaluated twice when both columns are needed

On Oct 19, 2009, at 10:49 AM, Gerhard Wiesinger wrote:

None of the function is declared VOLATILE. Any other idea?

If they are not declared with a volatility category at all, the
default is VOLATILE. Is that a possibility?

--
-- Christophe Pettus
xof@thebuild.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gerhard Wiesinger (#5)
Re: Function returning 2 columns evaluated twice when both columns are needed

Gerhard Wiesinger <lists@wiesinger.com> writes:

On Mon, 19 Oct 2009, Tom Lane wrote:

Probably because you have the function declared VOLATILE.

None of the function is declared VOLATILE. Any other idea?

[ shrug... ] There are other possible reasons why the planner would
fail to flatten a subquery, but none of them apply to the example you
showed. And your example function *was* VOLATILE, by default.

regards, tom lane

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#7)
Re: Function returning 2 columns evaluated twice when both columns are needed

Hello

2009/10/19 Tom Lane <tgl@sss.pgh.pa.us>:

Gerhard Wiesinger <lists@wiesinger.com> writes:

On Mon, 19 Oct 2009, Tom Lane wrote:

Probably because you have the function declared VOLATILE.

None of the function is declared VOLATILE. Any other idea?

[ shrug... ]  There are other possible reasons why the planner would
fail to flatten a subquery, but none of them apply to the example you
showed.  And your example function *was* VOLATILE, by default.

I checked this on 8.5 and function is evaluated more time although is immutable.

postgres=# create or replace function foo(out a int, out b int)
returns record as $$
begin
raise notice 'start foo';
a := 10; b := 20;
return;
end;
$$ language plpgsql immutable;
CREATE FUNCTION

postgres=# select (foo()).*;
NOTICE: start foo
NOTICE: start foo
a │ b
────┼────
10 │ 20
(1 row)

I was surprised, there are necessary subselect, but "offset" is optional:

postgres=# select (foo).* from (select foo()) f;
NOTICE: start foo
a │ b
────┼────
10 │ 20
(1 row)

postgres=# select (foo).* from (select foo() offset 0) f;
NOTICE: start foo
a │ b
────┼────
10 │ 20
(1 row)

regards
Pavel Stehule

Show quoted text

                       regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#8)
Re: Function returning 2 columns evaluated twice when both columns are needed

On Tue, Oct 20, 2009 at 5:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

[ shrug... ]  There are other possible reasons why the planner would
fail to flatten a subquery, but none of them apply to the example you
showed.  And your example function *was* VOLATILE, by default.

I checked this on 8.5 and function is evaluated more time although is immutable.

postgres=# create or replace function foo(out a int, out b int)
returns record as $$
begin
raise notice 'start foo';
a := 10; b := 20;
return;
end;
$$ language plpgsql immutable;
CREATE FUNCTION

postgres=# select (foo()).*;

This is because select (func()).* is expanded to mean:
select func(f1), func(f2) ... func(fn);

This is a general issue with '*' because in postgres it means:
'evaluate me for each field of me', not 'return all fields of me'. I
don't think our behavior in this regard is correct (afaict i'm in the
minority though).

merlin

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#9)
Re: Function returning 2 columns evaluated twice when both columns are needed

2009/10/21 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Oct 20, 2009 at 5:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

[ shrug... ]  There are other possible reasons why the planner would
fail to flatten a subquery, but none of them apply to the example you
showed.  And your example function *was* VOLATILE, by default.

I checked this on 8.5 and function is evaluated more time although is immutable.

postgres=# create or replace function foo(out a int, out b int)
returns record as $$
begin
raise notice 'start foo';
a := 10; b := 20;
return;
end;
$$ language plpgsql immutable;
CREATE FUNCTION

postgres=# select (foo()).*;

This is because select (func()).* is expanded to mean:
select func(f1), func(f2) ... func(fn);

This is a general issue with '*' because in postgres it means:
'evaluate me for each field of me', not 'return all fields of me'. I
don't think our behavior in this regard is correct (afaict i'm in the
minority though).

I understand to this mechanism. This is only correction some previous
messages. This behave isn't depend on function immutability or
volatility. But I agree with you, so this is really problem - it is
very silent. Maybe we could to raise some warning or we could to move
funccall to subselect

like
SELECT (foo()).* to SELECT ( (SELECT foo()) ).*

Regards
Pavel Stehule

Show quoted text

merlin

#11Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#10)
Re: Function returning 2 columns evaluated twice when both columns are needed

On Wed, Oct 21, 2009 at 12:37 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2009/10/21 Merlin Moncure <mmoncure@gmail.com>:

On Tue, Oct 20, 2009 at 5:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

[ shrug... ]  There are other possible reasons why the planner would
fail to flatten a subquery, but none of them apply to the example you
showed.  And your example function *was* VOLATILE, by default.

I checked this on 8.5 and function is evaluated more time although is immutable.

postgres=# create or replace function foo(out a int, out b int)
returns record as $$
begin
raise notice 'start foo';
a := 10; b := 20;
return;
end;
$$ language plpgsql immutable;
CREATE FUNCTION

postgres=# select (foo()).*;

This is because select (func()).* is expanded to mean:
select func(f1), func(f2) ... func(fn);

This is a general issue with '*' because in postgres it means:
'evaluate me for each field of me', not 'return all fields of me'. I
don't think our behavior in this regard is correct (afaict i'm in the
minority though).

I understand to this mechanism. This is only correction some previous
messages. This behave isn't depend on function immutability or
volatility. But I agree with you, so this is really problem - it is
very silent. Maybe we could to raise some warning or we could to move
funccall to subselect

like
SELECT (foo()).* to SELECT ( (SELECT foo()) ).*

If we are going to change I think ultimately the best answer is that
'(me).*' should mean: 'return all the fields of me', whatever 'me' is,
meaning that:

create view v as select (foo).* from foo;

should now change the definition of v if we add a column to foo. As opposed to:

create view v as select * from foo;

which would not (and shouldn't).

merlin