Subselects in select expressions

Started by Royce Ausburnover 14 years ago3 messagesgeneral
Jump to latest
#1Royce Ausburn
royce.ml@inomial.com

I'm having trouble understanding why these two queries produce different results:

test=# select (select random()) from generate_series(1,10); -- rows are the same
?column?
-------------------
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
0.770797704812139
(10 rows)

test=# select (select random() where generate_series is not null) from generate_series(1,10); -- rows are different
?column?
--------------------
0.561828337144107
0.0275383000262082
0.290950470604002
0.281174722127616
0.530742571223527
0.617655908688903
0.169655770529062
0.428002137690783
0.442224354483187
0.203044794034213
(10 rows)

I understand that it's likely an optimisation thing -- postgres knows that the subselect doesn't depend on the FROM rows so it evaluates it only once, but is this really correct behaviour? Ideally, shouldn't postgres know that each invocation of random() produces different results and so decide that it should execute it for each row? If not, why?

Thanks!

--Royce

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Royce Ausburn (#1)
Re: Subselects in select expressions

Royce Ausburn <royce.ml@inomial.com> writes:

[ random() is evaluated only once in ]
test=# select (select random()) from generate_series(1,10); -- rows are the same

I understand that it's likely an optimisation thing -- postgres knows
that the subselect doesn't depend on the FROM rows so it evaluates it
only once, but is this really correct behaviour?

I can't find anything in the SQL standard that explicitly addresses this
point, but a bit of googling turns up documentation for both Oracle and
DB/2 that say they evaluate uncorrelated subqueries just once:
http://www.smart-soft.co.uk/Oracle/advanced-sql-tutorial.htm
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/sqlp/rbafycorrs.htm
whereas correlated subqueries are evaluated once per row.

So we're in good company here. Even if we were not, I'd be loath to
change it, because people have historically relied on this behavior to
force single evaluation of what otherwise would be considered volatile
functions.

regards, tom lane

#3Royce Ausburn
royce.ml@inomial.com
In reply to: Tom Lane (#2)
Re: Subselects in select expressions

On 02/11/2011, at 2:16 PM, Tom Lane wrote:

Royce Ausburn <royce.ml@inomial.com> writes:

[ random() is evaluated only once in ]
test=# select (select random()) from generate_series(1,10); -- rows are the same

I understand that it's likely an optimisation thing -- postgres knows
that the subselect doesn't depend on the FROM rows so it evaluates it
only once, but is this really correct behaviour?

I can't find anything in the SQL standard that explicitly addresses this
point, but a bit of googling turns up documentation for both Oracle and
DB/2 that say they evaluate uncorrelated subqueries just once:
http://www.smart-soft.co.uk/Oracle/advanced-sql-tutorial.htm
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/sqlp/rbafycorrs.htm
whereas correlated subqueries are evaluated once per row.

So we're in good company here. Even if we were not, I'd be loath to
change it, because people have historically relied on this behavior to
force single evaluation of what otherwise would be considered volatile
functions.

That's good enough for me, Tom. Thanks =)