ORDER BY vs. volatile functions

Started by Andrew Gierthabout 16 years ago9 messages
#1Andrew Gierth
andrew@tao11.riddles.org.uk

This query:

select random() from generate_series(1,10) order by random();

produces sorted output. Should it?

--
Andrew (irc:RhodiumToad)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#1)
Re: ORDER BY vs. volatile functions

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

This query:
select random() from generate_series(1,10) order by random();
produces sorted output.
Should it?

It always has; we'd doubtless break some apps if we changed that.

regards, tom lane

#3Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#2)
Re: ORDER BY vs. volatile functions

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

This query:
select random() from generate_series(1,10) order by random();
produces sorted output.
Should it?

Tom> It always has; we'd doubtless break some apps if we changed that.

For bonus weirdness:

select distinct random(),random() from generate_series(1,10);
set enable_hashagg=off;
select distinct random(),random() from generate_series(1,10);

I think _that_ one is a bug.

--
Andrew (irc:RhodiumToad)

#4Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Andrew Gierth (#1)
Re: ORDER BY vs. volatile functions

Andrew Gierth wrote:

This query:

select random() from generate_series(1,10) order by random();
produces sorted output. Should it?

I recall a workaround from a different thread[1]http://archives.postgresql.org/pgsql-general/2006-11/msg01544.php if specifically
were looking for random ordering of random numbers is:
select random() from foo order by random()+1;

The thread has more odd corner cases with multiple calls
to random() and sorts as well.

[1]: http://archives.postgresql.org/pgsql-general/2006-11/msg01544.php

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#3)
Re: ORDER BY vs. volatile functions

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

For bonus weirdness:

select distinct random(),random() from generate_series(1,10);
set enable_hashagg=off;
select distinct random(),random() from generate_series(1,10);

I think _that_ one is a bug.

Hmm. I think the first one is a bug --- the two invocations of
random() in the tlist shouldn't be folded together.

regards, tom lane

#6Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#5)
Re: ORDER BY vs. volatile functions

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

For bonus weirdness:

select distinct random(),random() from generate_series(1,10);
set enable_hashagg=off;
select distinct random(),random() from generate_series(1,10);

I think _that_ one is a bug.

Tom> Hmm. I think the first one is a bug --- the two invocations of
Tom> random() in the tlist shouldn't be folded together.

That's what I meant.

If you try it using nextval(), you'll notice that the function does
in fact get called twice per row, but one of the results is thrown
away and replaced with the other one.

--
Andrew.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#6)
Re: ORDER BY vs. volatile functions

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

If you try it using nextval(), you'll notice that the function does
in fact get called twice per row, but one of the results is thrown
away and replaced with the other one.

Yeah. The problem is that setrefs.c is generating a tlist for the
hashagg node in which both output expressions point to the first
output of the underlying scan node, because it's just relying on
equal() to match up the expressions. I'm testing a fix now ...

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ron Mayer (#4)
Re: ORDER BY vs. volatile functions

Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:

[1] http://archives.postgresql.org/pgsql-general/2006-11/msg01544.php

FWIW, the behavior has changed from the time of that discussion ---
we now track sort ordering using EquivalenceClasses, which are able
to distinguish different instances of textually equal() volatile
expressions. The particular cases of
select random() from foo order by 1;
select random() from foo order by random();
still behave the same, but that's intentional for backwards
compatibility (and because SQL99 forbids the first, which would mean
there's no way to get the behavior except via deprecated syntax).
Beyond the case of order by/group by items being matched to tlist
items, I'd generally expect that the system should act as though
different textual instances of random() are evaluated separately.

regards, tom lane

#9Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#8)
Re: ORDER BY vs. volatile functions

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:

[1] http://archives.postgresql.org/pgsql-general/2006-11/msg01544.php

Tom> FWIW, the behavior has changed from the time of that discussion ---
Tom> we now track sort ordering using EquivalenceClasses, which are able
Tom> to distinguish different instances of textually equal() volatile
Tom> expressions. The particular cases of
Tom> select random() from foo order by 1;
Tom> select random() from foo order by random();
Tom> still behave the same, but that's intentional for backwards
Tom> compatibility (and because SQL99 forbids the first, which would mean
Tom> there's no way to get the behavior except via deprecated syntax).

SQL99 doesn't forbid:

select random() as r from foo order by r;

or

select r from (select random() as r from foo) as s order by r;

--
Andrew (irc:RhodiumToad)