how to use aggregate functions in this case
Hi,
Thats my code snipped:
SELECT v_rec1.user,
sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as "0 to 25",
sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END) as "25 to 50",
sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END) as "50 to 100"
INTO v_rec2
GROUP BY user;
Now I want to summuarize the "0 to 25" values and the others in the same query.
Somehow like this: count("0 to 25")
But I want to do it with every single user and I don't know how to do that
A result should look like this:
user percentage count
smith "0 to 25" 5
smith "25 to 50" 7
smith "50 to 75" 2
jones "0 to 25" 11
jones "25 to 50" 1
jones "50 to 75" 3
Hope someone who can help me
Janek Sendrowski
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Janek Sendrowski wrote
Hi,
Thats my code snipped:
SELECT v_rec1.user,
sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as
"0 to 25",
sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END)
as "25 to 50",
sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END)
as "50 to 100"
INTO v_rec2
GROUP BY user;Now I want to summuarize the "0 to 25" values and the others in the same
query.
Somehow like this: count("0 to 25")
But I want to do it with every single user and I don't know how to do thatA result should look like this:
user percentage count
smith "0 to 25" 5
smith "25 to 50" 7
smith "50 to 75" 2
jones "0 to 25" 11
jones "25 to 50" 1
jones "50 to 75" 3Hope someone who can help me
Janek Sendrowski
--
Sent via pgsql-general mailing list (
pgsql-general@
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
You have to write three queries (select statements) and then "UNION ALL"
them together. Each sub-query has a where clause matching the range you
wish to aggregate for.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768523.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>
<div>Sorry, I formulated it wrong.</div>
<div>My problem is, that I want to count the ranges for every user, but if I use count(range), it counts the ranges of all users.</div>
<div> </div>
<div>Janek Sendrowski</div>
</div></div></body></html>
Janek Sendrowski wrote
Sorry, I formulated it wrong.
My problem is, that I want to count the ranges for every user, but if I
use count(range), it counts the ranges of all users.
Assuming your example output is indeed what you desire:
SELECT user, '0 to 25'::varchar AS percentage, count(*) FROM tbl WHERE val
BETWEEN 0 AND 25 GROUP BY user
UNION ALL
SELECT user, '25 to 50'::varchar, count(*) FROM tbl WHERE val BETWEEN 25 AND
50 GROUP BY user
...and so forth
Note that BETWEEN has inclusive end-points so anything with exactly 25 (for
example) is being double-counted.
Your other option is something like:
SELECT user, percentage_range_category, count(*) FROM (
SELECT recordid, user, CASE .... END::varchar AS
percentage_range_category
) categorize
GROUP BY user, percentage_range_category;
In short for every record you simply categorize the value then add that
category to your group-by.
Both are equally valid and the second one is probably easier to comprehend;
the first option just happened to occur to me first. I have no idea which
one would perform better in theory nor specifically with your data.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768525.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Sun, Aug 25, 2013 at 5:59 PM, Janek Sendrowski <janek12@web.de> wrote:
SELECT v_rec1.user,
sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as
"0 to 25",
sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END)
as "25 to 50",
sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END)
as "50 to 100"
INTO v_rec2
GROUP BY user;Now I want to summuarize the "0 to 25" values and the others in the same
query.
Somehow like this: count("0 to 25")
But I want to do it with every single user and I don't know how to do thatA result should look like this:
user percentage count
smith "0 to 25" 5
smith "25 to 50" 7
smith "50 to 75" 2
jones "0 to 25" 11
jones "25 to 50" 1
jones "50 to 75" 3
This appears to be some kind of equal interval problem.
SELECT v_rec1.user,
WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4) AS bucket
COUNT(*) as count,
FROM v_rec2
GROUP BY user, bucket;
(Untested, but this should be the gist.)
Bucket 1 would be 0 to 25, bucket 2 is 25 to 50, 3 is 50 to 75, 4 is 75 to
100. If you really need to change the bucket number to some kind of text,
you can probably nest this query inside another that uses a CASE to pick
the text based on on the bucket number.
Good luck.
On Sun, Aug 25, 2013 at 8:36 PM, BladeOfLight16 <bladeoflight16@gmail.com>wrote:
This appears to be some kind of equal interval problem.
SELECT v_rec1.user,
WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4) AS bucket
COUNT(*) as count,
FROM v_rec2
GROUP BY user, bucket;(Untested, but this should be the gist.)
Bucket 1 would be 0 to 25, bucket 2 is 25 to 50, 3 is 50 to 75, 4 is 75 to
100. If you really need to change the bucket number to some kind of text,
you can probably nest this query inside another that uses a CASE to pick
the text based on on the bucket number.Good luck.
Then again, I guess you don't need a nested query.
SELECT v_rec1.user,
CASE WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4)
WHEN 1 THEN '0 to 25'
WHEN 2 THEN '25 to 50'
WHEN 3 THEN '50 to 75'
WHEN 4 THEN '75 to 100'
ELSE 'But how?'
END CASE AS quarter_percentage
COUNT(*) as count,
FROM v_rec2
GROUP BY user, quarter_percentage;
BladeOfLight16 wrote
Then again, I guess you don't need a nested query.
SELECT v_rec1.user,
CASE WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4)
WHEN 1 THEN '0 to 25'
WHEN 2 THEN '25 to 50'
WHEN 3 THEN '50 to 75'
WHEN 4 THEN '75 to 100'
ELSE 'But how?'
END CASE AS quarter_percentage
COUNT(*) as count,
FROM v_rec2
GROUP BY user, quarter_percentage;
This is clean but requires the use of equal intervals.
Another option, though I am unfamiliar with the exact syntax, is to use the
contains operator and an "intrange" (range type, for integer or whatever
actual type is needed).
SELECT CASE WHEN val @> '(0, 25]' -- '(25,50]', etc...
This allows for uneven bucket sizes and avoid the double-inclusive endpoints
problem that results from using BETWEEN.
Requires 9.2
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768573.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>
<div>Hi,</div>
<div> </div>
<div>thanks for all your answers.</div>
<div>I'll have a try with the contains operator and the intrange, but before I'd like to know if this would work:</div>
<div> </div>
<div>CASE WHEN a >= 0 AND a < 25</div>
<div>CASE WHEN a >= 25 AND a < 50</div>
<div> </div>
<div>There wouldn't be a double endpoint. I just have to decide which range the endpoint includes.</div>
<div> </div>
<div>Janek Sendrowski</div>
</div></div></body></html>
Janek Sendrowski wrote
Hi,
thanks for all your answers.
I'll have a try with the contains operator and the intrange, but
before I'd like to know if this would work:
CASE WHEN a >= 0 AND a < 25
CASE WHEN a >= 25 AND a < 50
There wouldn't be a double endpoint. I just have to decide which range
the endpoint includes.
Janek Sendrowski
Yes, using explicit comparison operators with "AND" will work just fine; its
just a little more verbose so the other options, if available, are preferred
from a readability standpoint. I do not know whether the different options
may have different performance characteristics.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768636.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
David Johnston wrote
Janek Sendrowski wrote
Hi,
thanks for all your answers.
I'll have a try with the contains operator and the intrange, but
before I'd like to know if this would work:
CASE WHEN a >= 0 AND a < 25
CASE WHEN a >= 25 AND a < 50
There wouldn't be a double endpoint. I just have to decide which
range the endpoint includes.
Janek Sendrowski
Yes, using explicit comparison operators with "AND" will work just fine;
its just a little more verbose so the other options, if available, are
preferred from a readability standpoint. I do not know whether the
different options may have different performance characteristics.David J.
And just for completeness:
WITH range_def (low, high, label) AS (
VALUES (0, 25, '0 to 25'::varchar), (25, 50, '25 to 50')
)
SELECT ...
FROM source_data
JOIN range_def ON (val >= low AND val < high)
[the rest of the query]
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768721.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general