random row from a subset

Started by Peter Koukoulisover 8 years ago4 messagesgeneral
Jump to latest
#1Peter Koukoulis
pkoukoulis@gmail.com

I'm attempting to get a random, based on a range that spans 1 to the
maximum number of rows that for a subset.
I run the query in Oracle sucessfully and get a different number each time
and only a single number, which is what I am expecting,

but when I run the same query, albeit the random function is different, I
either observe no result, a single row or two rows,
for example:

ft_node=# select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn
, count(*) over() max_rn
from customer where c_d_id=5
) t
where rn = (select floor(random()*(max_rn))+1);
c_id
------
2047
(1 row)

ft_node=# select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn
, count(*) over() max_rn
from customer where c_d_id=5
) t
where rn = (select floor(random()*(max_rn))+1);
c_id
------
(0 rows)

ft_node=# select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn
, count(*) over() max_rn
from customer where c_d_id=5
) t
where rn = (select floor(random()*(max_rn))+1);
c_id
------
1298
2608
(2 rows)

But in Oracle when I run the same query, I observe a consistent randomly
selected customer id, which is what I expecting:

SQL> select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn, count(*) over()
max_rn
from customer
where c_d_id=:d_id
)
where
rn = (select floor(dbms_random.value(1,max_rn+1)) from dual) 2 3 4
5 6 7 8 ;

C_ID
----------
2938

SQL> select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn, count(*) over()
max_rn
from customer
where c_d_id=:d_id
)
where
rn = (select floor(dbms_random.value(1,max_rn+1)) from dual) 2 3 4
5 6 7 8
9 ;

C_ID
----------
2204

SQL> select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn, count(*) over()
max_rn
from customer
where c_d_id=:d_id
)
where
rn = (select floor(dbms_random.value(1,max_rn+1)) from dual) 2 3 4
5 6 7 8
9 ;

C_ID
----------
2265

Can somebody help with formulating a SQL statement that would behave as how
the existing SQL statement does in Oracle, but not PostgreSQL?

#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Koukoulis (#1)
Re: random row from a subset

On 09/20/2017 02:33 AM, Peter Koukoulis wrote:

I'm attempting to get a random, based on a range that spans 1 to the
maximum number of rows that for a subset.
I run the query in Oracle sucessfully and get a different number each
time and only a single number, which is what I am expecting,

but when I run the same query, albeit the random function is different,
I either observe no result, a single row or two rows, 
for example:

ft_node=# select c_id
from    (
         select c_id, row_number() over (order by c_d_id) as rn
              ,  count(*) over() max_rn
         from customer where c_d_id=5
        ) t
where rn = (select floor(random()*(max_rn))+1);

The problem here is that random() is volatile, so it's executed for each
row. So essentially if the subselect has 100 rows, you'll get 100 random
values. So you're "rolling the dice" for every row independently.
Sometimes one row matches, sometime none, sometime more than one.

You need to do either this:

with rand as (select random() as r)
select c_id
from (
select c_id, row_number() over (order by c_d_id) as rn
, count(*) over() max_rn
from customer where c_d_id=5
) t
where rn = (select floor(r*(max_rn))+1 from rand);

or define an immutable wrapper for random():

CREATE FUNCTION random_stable() RETURNS DOUBLE PRECISION
AS 'SELECT random()'
LANGUAGE SQL
IMMUTABLE;

and use that instead.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Peter Koukoulis
pkoukoulis@gmail.com
In reply to: Tomas Vondra (#2)
Re: random row from a subset

thanks, interestingly your method works in both Oracle and PostgreSQL,
albeit with a different random function call.
It does not work in SQL Anywhere though.

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Koukoulis (#3)
Re: random row from a subset

On 09/20/2017 01:28 PM, bluefrog wrote:

thanks, interestingly your method works in both Oracle and PostgreSQL,
albeit with a different random function call.
It does not work in SQL Anywhere though.

You will have to ask SQL Anywhere people, I guess.

cheers

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general