Counting booleans in GROUP BY sections
Good evening,
I am trying to count the booleans per each GROUP BY section by the
following stored function:
CREATE OR REPLACE FUNCTION words_list_puzzles(
in_start interval,
in_end interval
) RETURNS TABLE (
out_label text,
out_count bigint,
out_puzzle boolean,
out_mid bigint,
out_secret text,
out_gid integer,
out_score integer
) AS
$func$
SELECT
TO_CHAR(played, 'Mon YYYY') AS label,
COUNT(NULLIF(puzzle, FALSE)), -- this only returns 0 or 1, why?
puzzle,
mid,
MD5(mid || ‘my little secret’) AS secret,
gid,
score
FROM words_moves
WHERE action = 'play'
AND LENGTH(hand) = 7
AND (LENGTH(letters) = 7 OR score > 90)
AND played > CURRENT_TIMESTAMP - in_start
AND played < CURRENT_TIMESTAMP - in_end
GROUP BY label, puzzle, mid, secret, gid, score
ORDER BY played DESC
$func$ LANGUAGE sql STABLE;
But when I run it, I only get 0 or 1 in the out_count column:
words_ru=> select * from words_list_puzzles(interval '2 year', interval '1
year');
out_label | out_count | out_puzzle | out_mid | out_secret
| out_gid | out_score
-----------+-----------+------------+---------+----------------------------------+---------+-----------
Nov 2018 | 0 | f | 1326876 |
e4928d3c34f50b8e6eabf7bad5b932fe | 46007 | 28
Nov 2018 | 0 | f | 1324466 |
6228ba509a7124f485feb5c1acbb6b68 | 45922 | 26
Nov 2018 | 0 | f | 1322050 |
b67b091d383678de392bf7370c735cab | 45877 | 34
Nov 2018 | 0 | f | 1320017 |
35f03b0c7159cec070c00aa80359fd42 | 44255 | 120
Nov 2018 | 0 | f | 1319160 |
83df42f7ad398bbb060fc02ddfdc62c0 | 45031 | 95
.....
May 2018 | 0 | f | 264251 |
2fff1154962966b16a2996387e30ae7f | 10946 | 99
May 2018 | 1 | t | 257620 |
645613db6ea40695dc967d8090ab3246 | 12713 | 93
May 2018 | 0 | f | 245792 |
bb75bfd9cb443ff541b199d893c68117 | 12359 | 24
May 2018 | 1 | t | 243265 |
d899a5d642ccd96d931194f48ef56d53 | 11950 | 123
May 2018 | 0 | f | 231953 |
ad53b5b2c0d4cced3d50e8b44ad53e55 | 11910 | 32
- while I was hoping to get 2 for the "May 2018" section.
What am I doing wrong please, why don't the values add up? Below is the
table desc:
words_ru=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
letters | text | | |
hand | text | | |
puzzle | boolean | | not null | false
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid) ON
DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid) ON
DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE
Thank you
Alex
On 11/29/19 8:38 AM, Alexander Farber wrote:
Good evening,
I am trying to count the booleans per each GROUP BY section by the
following stored function:CREATE OR REPLACE FUNCTION words_list_puzzles(
in_start interval,
in_end interval) RETURNS TABLE (
out_label text,
out_count bigint,
out_puzzle boolean,
out_mid bigint,
out_secret text,
out_gid integer,
out_score integer
) AS
$func$SELECT
TO_CHAR(played, 'Mon YYYY') AS label,
COUNT(NULLIF(puzzle, FALSE)), -- this only returns 0 or 1, why?
If I am following it is because you have mid in GROUP BY and mid is a
PK. Since mid will always be unique you will have at most on row per group.
puzzle,
mid,
MD5(mid || ‘my little secret’) AS secret,
gid,
scoreFROM words_moves
WHERE action = 'play'
AND LENGTH(hand) = 7
AND (LENGTH(letters) = 7 OR score > 90)
AND played > CURRENT_TIMESTAMP - in_start
AND played < CURRENT_TIMESTAMP - in_end
GROUP BY label, puzzle, mid, secret, gid, score
ORDER BY played DESC$func$ LANGUAGE sql STABLE;
But when I run it, I only get 0 or 1 in the out_count column:
words_ru=> select * from words_list_puzzles(interval '2 year', interval
'1 year');
out_label | out_count | out_puzzle | out_mid | out_secret
| out_gid | out_score
-----------+-----------+------------+---------+----------------------------------+---------+-----------
Nov 2018 | 0 | f | 1326876 |
e4928d3c34f50b8e6eabf7bad5b932fe | 46007 | 28
Nov 2018 | 0 | f | 1324466 |
6228ba509a7124f485feb5c1acbb6b68 | 45922 | 26
Nov 2018 | 0 | f | 1322050 |
b67b091d383678de392bf7370c735cab | 45877 | 34
Nov 2018 | 0 | f | 1320017 |
35f03b0c7159cec070c00aa80359fd42 | 44255 | 120
Nov 2018 | 0 | f | 1319160 |
83df42f7ad398bbb060fc02ddfdc62c0 | 45031 | 95
.....
May 2018 | 0 | f | 264251 |
2fff1154962966b16a2996387e30ae7f | 10946 | 99
May 2018 | 1 | t | 257620 |
645613db6ea40695dc967d8090ab3246 | 12713 | 93
May 2018 | 0 | f | 245792 |
bb75bfd9cb443ff541b199d893c68117 | 12359 | 24
May 2018 | 1 | t | 243265 |
d899a5d642ccd96d931194f48ef56d53 | 11950 | 123
May 2018 | 0 | f | 231953 |
ad53b5b2c0d4cced3d50e8b44ad53e55 | 11910 | 32- while I was hoping to get 2 for the "May 2018" section.
What am I doing wrong please, why don't the values add up? Below is the
table desc:words_ru=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
letters | text | | |
hand | text | | |
puzzle | boolean | | not null | false
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES
words_games(gid) ON DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES
words_users(uid) ON DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADEThank you
Alex
--
Adrian Klaver
adrian.klaver@aklaver.com
Thank you Adrian, but -
On Fri, Nov 29, 2019 at 6:45 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
On 11/29/19 8:38 AM, Alexander Farber wrote:
CREATE OR REPLACE FUNCTION words_list_puzzles(
in_start interval,
in_end interval) RETURNS TABLE (
out_label text,
out_count bigint,
out_puzzle boolean,
out_mid bigint,
out_secret text,
out_gid integer,
out_score integer
) AS
$func$SELECT
TO_CHAR(played, 'Mon YYYY') AS label,
COUNT(NULLIF(puzzle, FALSE)), -- this only returns 0 or 1, why?If I am following it is because you have mid in GROUP BY and mid is a
PK. Since mid will always be unique you will have at most on row per group.puzzle,
mid,
MD5(mid || ‘my little secret’) AS secret,
gid,
scoreFROM words_moves
WHERE action = 'play'
AND LENGTH(hand) = 7
AND (LENGTH(letters) = 7 OR score > 90)
AND played > CURRENT_TIMESTAMP - in_start
AND played < CURRENT_TIMESTAMP - in_end
GROUP BY label, puzzle, mid, secret, gid, score
ORDER BY played DESC$func$ LANGUAGE sql STABLE;
But when I run it, I only get 0 or 1 in the out_count column:
words_ru=> select * from words_list_puzzles(interval '2 year', interval
'1 year');
out_label | out_count | out_puzzle | out_mid | out_secret
| out_gid | out_score-----------+-----------+------------+---------+----------------------------------+---------+-----------
Nov 2018 | 0 | f | 1326876 |
e4928d3c34f50b8e6eabf7bad5b932fe | 46007 | 28
Nov 2018 | 0 | f | 1324466 |
6228ba509a7124f485feb5c1acbb6b68 | 45922 | 26
Nov 2018 | 0 | f | 1322050 |
b67b091d383678de392bf7370c735cab | 45877 | 34
Nov 2018 | 0 | f | 1320017 |
35f03b0c7159cec070c00aa80359fd42 | 44255 | 120
Nov 2018 | 0 | f | 1319160 |
83df42f7ad398bbb060fc02ddfdc62c0 | 45031 | 95
.....
May 2018 | 0 | f | 264251 |
2fff1154962966b16a2996387e30ae7f | 10946 | 99
May 2018 | 1 | t | 257620 |
645613db6ea40695dc967d8090ab3246 | 12713 | 93
May 2018 | 0 | f | 245792 |
bb75bfd9cb443ff541b199d893c68117 | 12359 | 24
May 2018 | 1 | t | 243265 |
d899a5d642ccd96d931194f48ef56d53 | 11950 | 123
May 2018 | 0 | f | 231953 |
ad53b5b2c0d4cced3d50e8b44ad53e55 | 11910 | 32- while I was hoping to get 2 for the "May 2018" section.
What am I doing wrong please, why don't the values add up? Below is the
table desc:words_ru=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
letters | text | | |
hand | text | | |
puzzle | boolean | | not null | false
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES
words_games(gid) ON DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES
words_users(uid) ON DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE
if I remove GROUP BY mid, then I get the error:
ERROR: 42803: column "words_moves.mid" must appear in the GROUP BY clause
or be used in an aggregate function
LINE 18: mid,
^
LOCATION: check_ungrouped_columns_walker, parse_agg.c:1369
Regards
Alex
On Fri, Nov 29, 2019 at 12:48 PM Alexander Farber <
alexander.farber@gmail.com> wrote:
if I remove GROUP BY mid, then I get the error:
ERROR: 42803: column "words_moves.mid" must appear in the GROUP BY clause
or be used in an aggregate function
LINE 18: mid,
^
LOCATION: check_ungrouped_columns_walker, parse_agg.c:1369
Yes, you need to decide whether you want to output GROUPS (in which case
any detail more specific than your desired group needs to be aggregated) or
NOT (in which case you can probably use WINDOW functions to accomplish your
goal - count(...) OVER (PARTITION BY <the level of grouping you desire
knowledge about - year month it seems in this case>))
David J.
On 11/29/19 11:47 AM, Alexander Farber wrote:
Thank you Adrian, but -
if I remove GROUP BY mid, then I get the error:
ERROR: 42803: column "words_moves.mid" must appear in the GROUP BY
clause or be used in an aggregate function
LINE 18: mid,
^
LOCATION: check_ungrouped_columns_walker, parse_agg.c:1369
Yes because it is a case of opposing forces. When you remove mid from
the GROUP BY you get an single row for each group that has an aggregated
output where you can have count of > 1. In that case the database has
more then one choice for the mid to display and so it throws the error.
To go forward it would help to know what it is you are trying to achieve?
Regards
Alex
--
Adrian Klaver
adrian.klaver@aklaver.com
My context is that I have a table of player moves with PK mid (aka "move
id").
And I am able to find "interesting" moves by the high score or all 7 letter
tiles used.
But I do some human reviewing and set a "puzzle" boolean for truly
interesting moves.
For the reviewing tool I would like to display headers: a "Mon YYYY" plus
the number of true puzzles per section.
Thanks to David's hint the following seems to work even though I wonder if
it is the most optimal way to call TO_CHAR twice:
CREATE OR REPLACE FUNCTION words_list_puzzles(
in_start interval,
in_end interval
) RETURNS TABLE (
out_label text,
out_count bigint,
out_puzzle boolean,
out_mid bigint,
out_secret text,
out_gid integer,
out_score integer
) AS
$func$
SELECT
TO_CHAR(played, 'Mon YYYY') AS label,
-- used for header
COUNT(NULLIF(puzzle, FALSE)) OVER (PARTITION BY TO_CHAR(played,
'Mon YYYY')), --used for header
puzzle,
mid,
MD5(mid || 'my little secret') AS secret,
gid,
score
FROM words_moves
WHERE action = 'play'
AND LENGTH(hand) = 7
AND (LENGTH(letters) = 7 OR score > 90)
AND played > CURRENT_TIMESTAMP - in_start
AND played < CURRENT_TIMESTAMP - in_end
ORDER BY played DESC
$func$ LANGUAGE sql STABLE;
Regards
Alex
P.S: Below is my table description again and the output of the above
function:
words_ru=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
letters | text | | |
hand | text | | |
puzzle | boolean | | not null | false
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid) ON
DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid) ON
DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE
words_ru=> select * from words_list_puzzles(interval '2 year', interval '1
year');
out_label | out_count | out_puzzle | out_mid | out_secret
| out_gid | out_score
-----------+-----------+------------+---------+----------------------------------+---------+-----------
Nov 2018 | 1 | f | 1331343 |
78467b5f3bde3d3f2291cf539c949f79 | 46134 | 28
Nov 2018 | 1 | f | 1326876 |
e4928d3c34f50b8e6eabf7bad5b932fe | 46007 | 28
Nov 2018 | 1 | f | 1324466 |
6228ba509a7124f485feb5c1acbb6b68 | 45922 | 26
Nov 2018 | 1 | f | 1322050 |
b67b091d383678de392bf7370c735cab | 45877 | 34
Nov 2018 | 1 | f | 1320017 |
35f03b0c7159cec070c00aa80359fd42 | 44255 | 120
.....
May 2018 | 3 | f | 95114 |
e7e8bab64fab20f6fec229319e2bab40 | 7056 | 28
May 2018 | 3 | f | 88304 |
161c0638dede80f830a36efa6f428dee | 6767 | 40
May 2018 | 3 | f | 86180 |
4d47a65263331cf4e2d2956886b6a72f | 6706 | 26
May 2018 | 3 | f | 85736 |
debb1efd673c91947a8aa7f38be4217c | 6680 | 28
May 2018 | 3 | f | 82522 |
e55ec68a5a5dacc2bc463e397198cb1c | 6550 | 27
Apr 2018 | 0 | f | 78406 |
f5d264ccfe94aaccd90ce6c019716d4d | 5702 | 58
Apr 2018 | 0 | f | 77461 |
404886e913b698596f9cf3648ddf6fa4 | 1048 | 26
(415 rows)
On 11/30/19 4:08 AM, Alexander Farber wrote:
My context is that I have a table of player moves with PK mid (aka "move
id").And I am able to find "interesting" moves by the high score or all 7
letter tiles used.But I do some human reviewing and set a "puzzle" boolean for truly
interesting moves.For the reviewing tool I would like to display headers: a "Mon YYYY"
plus the number of true puzzles per section.Thanks to David's hint the following seems to work even though I wonder
if it is the most optimal way to call TO_CHAR twice:
Given that played contains values, I assume, that are at multiple points
in a month and you want the 'group' to be a month it looks alright to
me. Though if it bothers you then another option is date_trunc():
test=# select date_trunc('month', '11/02/2019 13:00'::timestamp),
date_trunc('month', '11/23/2019 13:00'::timestamp);
date_trunc | date_trunc
---------------------+---------------------
11/01/2019 00:00:00 | 11/01/2019 00:00:00
CREATE OR REPLACE FUNCTION words_list_puzzles(
in_start interval,
in_end interval
) RETURNS TABLE (
out_label text,
out_count bigint,
out_puzzle boolean,
out_mid bigint,
out_secret text,
out_gid integer,
out_score integer
) AS
$func$
SELECT
TO_CHAR(played, 'Mon YYYY') AS label,
-- used for header
COUNT(NULLIF(puzzle, FALSE)) OVER (PARTITION BY TO_CHAR(played,
'Mon YYYY')), --used for header
puzzle,
mid,
MD5(mid || 'my little secret') AS secret,
gid,
score
FROM words_moves
WHERE action = 'play'
AND LENGTH(hand) = 7
AND (LENGTH(letters) = 7 OR score > 90)
AND played > CURRENT_TIMESTAMP - in_start
AND played < CURRENT_TIMESTAMP - in_end
ORDER BY played DESC
$func$ LANGUAGE sql STABLE;Regards
AlexP.S: Below is my table description again and the output of the above
function:words_ru=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
letters | text | | |
hand | text | | |
puzzle | boolean | | not null | false
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES
words_games(gid) ON DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES
words_users(uid) ON DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADEwords_ru=> select * from words_list_puzzles(interval '2 year', interval
'1 year');
out_label | out_count | out_puzzle | out_mid | out_secret
| out_gid | out_score
-----------+-----------+------------+---------+----------------------------------+---------+-----------
Nov 2018 | 1 | f | 1331343 |
78467b5f3bde3d3f2291cf539c949f79 | 46134 | 28
Nov 2018 | 1 | f | 1326876 |
e4928d3c34f50b8e6eabf7bad5b932fe | 46007 | 28
Nov 2018 | 1 | f | 1324466 |
6228ba509a7124f485feb5c1acbb6b68 | 45922 | 26
Nov 2018 | 1 | f | 1322050 |
b67b091d383678de392bf7370c735cab | 45877 | 34
Nov 2018 | 1 | f | 1320017 |
35f03b0c7159cec070c00aa80359fd42 | 44255 | 120
.....
May 2018 | 3 | f | 95114 |
e7e8bab64fab20f6fec229319e2bab40 | 7056 | 28
May 2018 | 3 | f | 88304 |
161c0638dede80f830a36efa6f428dee | 6767 | 40
May 2018 | 3 | f | 86180 |
4d47a65263331cf4e2d2956886b6a72f | 6706 | 26
May 2018 | 3 | f | 85736 |
debb1efd673c91947a8aa7f38be4217c | 6680 | 28
May 2018 | 3 | f | 82522 |
e55ec68a5a5dacc2bc463e397198cb1c | 6550 | 27
Apr 2018 | 0 | f | 78406 |
f5d264ccfe94aaccd90ce6c019716d4d | 5702 | 58
Apr 2018 | 0 | f | 77461 |
404886e913b698596f9cf3648ddf6fa4 | 1048 | 26
(415 rows)
--
Adrian Klaver
adrian.klaver@aklaver.com