Find users that have ALL categories
Is this the most efficient way to write this query? Id like to get a
list of users that have the categories 1, 2, and 3?
SELECT user_id FROM user_categories WHERE category_id IN (1,2,3) GROUP
BY user_id HAVING COUNT(*) = 3
users_categories (user_id, category_id)
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
3 | 1
4 | 1
4 | 2
4 | 3
The result should produce 1 & 4.
On Wed, Jun 30, 2010 at 12:11:35AM -0700, Nick wrote:
Is this the most efficient way to write this query? Id like to get a
list of users that have the categories 1, 2, and 3?SELECT user_id FROM user_categories WHERE category_id IN (1,2,3) GROUP
BY user_id HAVING COUNT(*) = 3users_categories (user_id, category_id)
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
3 | 1
4 | 1
4 | 2
4 | 3The result should produce 1 & 4.
The above method depends on (user_id, category_id) being unique, and
excludes users with, say, categories 1, 2, 3 and 4. Are you sure that
that latter is what you want?
This is, I believe, a little clearer as to what it's actually doing,
and doesn't exclude user_ids with more matches:
SELECT user_id
FROM user_categories
GROUP BY user_id
HAVING array_agg(category_id) @> ARRAY[1,2,3]
ORDER BY user_id; /* Not really needed, but could be handy */
In 9.0, you'll be able to use the following to get only exact matches:
SELECT user_id
FROM user_categories
GROUP BY user_id
HAVING array_agg(category_id ORDER BY category_id) = ARRAY[1,2,3]
ORDER BY user_id; /* Not really needed, but could be handy */
Until then, you can make an array_sort() function like this:
CREATE OR REPLACE FUNCTION array_sort(ANYARRAY)
RETURNS ANYARRAY
LANGUAGE SQL
STRICT
AS $$
SELECT ARRAY(
SELECT unnest($1) AS i
ORDER BY i
);
$$;
then use it like this:
SELECT user_id
FROM user_categories
GROUP BY user_id
HAVING array_sort(array_agg(category_id)) = ARRAY[1,2,3]
ORDER BY user_id;
to get only exact matches.
As to speed, you'd have to test on your actual data sets. Indexing
user_id may help here.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Jul 01, 2010 at 04:26:38AM -0700, David Fetter wrote:
On Wed, Jun 30, 2010 at 12:11:35AM -0700, Nick wrote:
Is this the most efficient way to write this query? Id like to get a
list of users that have the categories 1, 2, and 3?SELECT user_id FROM user_categories WHERE category_id IN (1,2,3) GROUP
BY user_id HAVING COUNT(*) = 3The above method depends on (user_id, category_id) being unique, and
excludes users with, say, categories 1, 2, 3 and 4. Are you sure that
that latter is what you want?
AFAICT, the above code will include a user with categories 1 to 4. Why
do you think otherwise?
If the (user_id,category_id) combination isn't unique, it's easy to
change the HAVING clause into HAVING COUNT(DISTINCT category_id) = 3.
--
Sam http://samason.me.uk/
On Thu, Jul 01, 2010 at 12:37:55PM +0100, Sam Mason wrote:
On Thu, Jul 01, 2010 at 04:26:38AM -0700, David Fetter wrote:
On Wed, Jun 30, 2010 at 12:11:35AM -0700, Nick wrote:
Is this the most efficient way to write this query? Id like to
get a list of users that have the categories 1, 2, and 3?SELECT user_id FROM user_categories WHERE category_id IN (1,2,3)
GROUP BY user_id HAVING COUNT(*) = 3The above method depends on (user_id, category_id) being unique,
and excludes users with, say, categories 1, 2, 3 and 4. Are you
sure that that latter is what you want?AFAICT, the above code will include a user with categories 1 to 4.
Why do you think otherwise?If the (user_id,category_id) combination isn't unique, it's easy to
change the HAVING clause into HAVING COUNT(DISTINCT category_id) =
3.
Oops. You're right, of course. That's what I get for posting before
waking up. ;)
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate