percentile rank query
Hi all
I'm trying to calculate the percentile rank for a record based on a
'score' column, e.g. a column of integers such as:
23,77,88,23,23,23,12,12,12,13,13,13
without using a stored procedure.
So,
select count(*) as frequency, score
from scoretable
group by score
order by score
Yields:
frequency score
3 12
3 13
4 23
1 77
1 88
However I'd like this result set:
frequency score runningtotal
3 12 3
3 13 6
4 23 10
1 77 11
1 88 12
Where the running total is the previous frequency added to the current
frequency. Score order is significant.
So I can then do ((runningtotal-frequency)+(frequency/2))/(select
count(*) from scoretable) to give me the percentile rank for each
score.
Is this possible in one query? I just can't figure out how to get the
running total in a result set.
Thanks very much,
Will Temperley
William Temperley escreveu:
Hi all
I'm trying to calculate the percentile rank for a record based on a
'score' column, e.g. a column of integers such as:
23,77,88,23,23,23,12,12,12,13,13,13
without using a stored procedure.So,
select count(*) as frequency, score
from scoretable
group by score
order by scoreYields:
frequency score
3 12
3 13
4 23
1 77
1 88However I'd like this result set:
frequency score runningtotal
3 12 3
3 13 6
4 23 10
1 77 11
1 88 12Where the running total is the previous frequency added to the current
frequency. Score order is significant.So I can then do ((runningtotal-frequency)+(frequency/2))/(select
count(*) from scoretable) to give me the percentile rank for each
score.Is this possible in one query? I just can't figure out how to get the
running total in a result set.
Try:
SELECT count(*) AS frequency, score,
count((SELECT * FROM scoretable st2 WHERE st2.score <= st1.score)) AS
runningtotal
FROM scoretable st1
GROUP BY score
ORDER BY score
Osvaldo
On Thu, Apr 10, 2008 at 4:36 PM, Osvaldo Rosario Kussama
<osvaldo.kussama@gmail.com> wrote:
Try:
SELECT count(*) AS frequency, score,
count((SELECT * FROM scoretable st2 WHERE st2.score <= st1.score)) AS
runningtotal
FROM scoretable st1
GROUP BY score
ORDER BY scoreOsvaldo
Thankyou Osvaldo- that worked!
Final version:
SELECT count(*) AS frequency, score,
((((SELECT count(uid) FROM scoretable st2 WHERE st2.score <=
st1.score) - count(*)) + (count(*)/2))::float/(select
count(*) from scoretable))
FROM scoretable st1
GROUP BY score
ORDER BY score
I think that's a percentile rank now.
Cheers
Will
On Thu, Apr 10, 2008 at 05:20:21PM +0100, William Temperley wrote:
SELECT count(*) AS frequency, score,
((((SELECT count(uid) FROM scoretable st2 WHERE st2.score <=
st1.score) - count(*)) + (count(*)/2))::float/(select
count(*) from scoretable))FROM scoretable st1
GROUP BY score
ORDER BY scoreI think that's a percentile rank now.
I'm not quite sure how this is calculated but I think you may want to
be converting to a non-integral type earlier (i.e. as you're dividing
by two, not after). I also find all the subselects a bit difficult to
follow so have moved them around:
SELECT x.frequency, x.score, (x.rank + x.frequency / 2.0) / y.total AS pr
FROM (
SELECT count(*) AS frequency, score,
(SELECT count(*) FROM scoretable t WHERE t.score < s.score) AS rank
FROM scoretable s
GROUP BY score) x, (SELECT count(*) AS total FROM scoretable) y
ORDER BY score;
Sam