Whats the most efficient query for this result?
I have three tables (users, books, pencils) and would like to get a
list of all users with a count and total price of their books and
pencils for 2012-01-01...
So with this data...
users (user_id)
1
2
3
books (user_id, price, created)
1 | $10 | 2012-01-01
1 | $10 | 2012-01-01
3 | $10 | 2012-01-01
pencils
1 | $.50 | 2012-01-02
3 | $.50 | 2012-01-01
3 | $.50 | 2012-01-02
What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50
Hi Nick,
On 17/01/12 00:18, Nick wrote:
I have three tables (users, books, pencils) and would like to get a
list of all users with a count and total price of their books and
pencils for 2012-01-01...So with this data...
users (user_id)
1
2
3books (user_id, price, created)
1 | $10 | 2012-01-01
1 | $10 | 2012-01-01
3 | $10 | 2012-01-01pencils
1 | $.50 | 2012-01-02
3 | $.50 | 2012-01-01
3 | $.50 | 2012-01-02What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50
Seems straightforward enough - left join the tables, group the result on
user_id - so I'd write it as:
select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id
order by u.user_id
If you need something more efficient, summary tables may help - hard to
say without knowing more about the real data.
Tom
On Jan 17, 3:33 am, t...@audioboundary.com (Tom Molesworth) wrote:
Hi Nick,
On 17/01/12 00:18, Nick wrote:
I have three tables (users, books, pencils) and would like to get a
list of all users with a count and total price of their books and
pencils for 2012-01-01...So with this data...
users (user_id)
1
2
3books (user_id, price, created)
1 | $10 | 2012-01-01
1 | $10 | 2012-01-01
3 | $10 | 2012-01-01pencils
1 | $.50 | 2012-01-02
3 | $.50 | 2012-01-01
3 | $.50 | 2012-01-02What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50Seems straightforward enough - left join the tables, group the result on
user_id - so I'd write it as:select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id
order by u.user_idIf you need something more efficient, summary tables may help - hard to
say without knowing more about the real data.Tom
--
Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org)
To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-general
Thanks Tom. Thats what I originally thought it would be, but my
results (disregarding the date clause) show that user 1 has 2 pencils
instead of 1, and user 3 has 2 books instead of 1.
I guess the LEFT JOIN is joining the first LEFT JOIN instead of the
users table.
Any other thoughts on how to get books and pencils to individually
LEFT JOIN the users table?
On Jan 17, 3:33 am, t...@audioboundary.com (Tom Molesworth) wrote:
Hi Nick,
On 17/01/12 00:18, Nick wrote:
I have three tables (users, books, pencils) and would like to get a
list of all users with a count and total price of their books and
pencils for 2012-01-01...So with this data...
users (user_id)
1
2
3books (user_id, price, created)
1 | $10 | 2012-01-01
1 | $10 | 2012-01-01
3 | $10 | 2012-01-01pencils
1 | $.50 | 2012-01-02
3 | $.50 | 2012-01-01
3 | $.50 | 2012-01-02What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50Seems straightforward enough - left join the tables, group the result on
user_id - so I'd write it as:select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id
order by u.user_idIf you need something more efficient, summary tables may help - hard to
say without knowing more about the real data.Tom
--
Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org)
To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-general
I think I figured it out. I have to add a where clause...
select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created =
'2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created =
'2012-01-01'
WHERE b.user_id = u.user_id AND p.user_id = u.user_id
group by u.user_id
order by u.user_id
On Jan 17, 3:33 am, t...@audioboundary.com (Tom Molesworth) wrote:
Hi Nick,
On 17/01/12 00:18, Nick wrote:
I have three tables (users, books, pencils) and would like to get a
list of all users with a count and total price of their books and
pencils for 2012-01-01...So with this data...
users (user_id)
1
2
3books (user_id, price, created)
1 | $10 | 2012-01-01
1 | $10 | 2012-01-01
3 | $10 | 2012-01-01pencils
1 | $.50 | 2012-01-02
3 | $.50 | 2012-01-01
3 | $.50 | 2012-01-02What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50Seems straightforward enough - left join the tables, group the result on
user_id - so I'd write it as:select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id
order by u.user_idIf you need something more efficient, summary tables may help - hard to
say without knowing more about the real data.Tom
--
Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org)
To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-general
I take that back, now im incorrectly not getting user 2's results
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Nick
Sent: Tuesday, January 17, 2012 12:59 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Whats the most efficient query for this result?
On Jan 17, 3:33 am, t...@audioboundary.com (Tom Molesworth) wrote:
Hi Nick,
On 17/01/12 00:18, Nick wrote:
I have three tables (users, books, pencils) and would like to get a
list of all users with a count and total price of their books and
pencils for 2012-01-01...So with this data...
users (user_id)
1
2
3books (user_id, price, created)
1 | $10 | 2012-01-01
1 | $10 | 2012-01-01
3 | $10 | 2012-01-01pencils
1 | $.50 | 2012-01-02
3 | $.50 | 2012-01-01
3 | $.50 | 2012-01-02What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50Seems straightforward enough - left join the tables, group the result
on user_id - so I'd write it as:select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id
order by u.user_idIf you need something more efficient, summary tables may help - hard
to say without knowing more about the real data.Tom
--
Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org) To
make changes to your
subscription:http://www.postgresql.org/mailpref/pgsql-general
I take that back, now im incorrectly not getting user 2's results
---------------------------------------------------------------------
Not Tested & Psuedo Code But...
SELECT user_id, COALESCE(books_agg.count,0), COALESCE(books_agg.sum,0.00),
... [same for pencils]
FROM user u
LEFT JOIN ( SELECT user_id, COUNT(*), SUM(price) FROM book GROUP BY user_id
) books_agg ON ( books_agg.user_id = u.user_id )
LEFT JOIN ( SELECT user_id, COUNT(*), SUM(price) FROM pencils GROUP BY
user_id ) pencils_agg USING ( pencils_agg.user_id = u.user_id )
-- NOTE: THERE IS NO GROUP BY IN THIS PART OF THE QUERY; only in the
sub-queries
Basically do all your INDEPENDENT calculations first then simply JOIN the
various results to each other while replacing missing JOINs with reasonable
default values.
David J.
On 17/01/12 17:51, Nick wrote:
On Jan 17, 3:33 am, t...@audioboundary.com (Tom Molesworth) wrote:
Hi Nick,
On 17/01/12 00:18, Nick wrote:
What is the most efficient way to get this result...
query_result (user_id, book_count, book_price_total, pencil_count,
pencil_price_total)
1 | 2 | $20 | 0 | $0
2 | 0 | $0 | 0 | $0
3 | 1 | $10 | 1 | $.50Seems straightforward enough - left join the tables, group the result on
user_id - so I'd write it as:select u.user_id, count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total", count(p.user_id) as
"pencil_count", coalesce(sum(b.price), 0) as "pencil_price_total"
from users u
left join books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id
order by u.user_idIf you need something more efficient, summary tables may help - hard to
say without knowing more about the real data.Tom
--
Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org)
To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-generalThanks Tom. Thats what I originally thought it would be, but my
results (disregarding the date clause) show that user 1 has 2 pencils
instead of 1, and user 3 has 2 books instead of 1.I guess the LEFT JOIN is joining the first LEFT JOIN instead of the
users table.Any other thoughts on how to get books and pencils to individually
LEFT JOIN the users table?
Looking at it again, I think the missing part is the created fields -
you'll probably need both of those in the group by clause as well to get
meaningful numbers. I think that makes:
select u.user_id,
count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total",
count(p.user_id) as "pencil_count",
coalesce(sum(p.price), 0) as "pencil_price_total"
from tst.users u
left join tst.books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join tst.pencils p on p.user_id = u.user_id and p.created =
'2012-01-01'
group by u.user_id, p.created, b.created
order by u.user_id;
Tom
On Jan 17, 2012, at 21:08, Tom Molesworth <tom@audioboundary.com> wrote:
On 17/01/12 17:51, Nick wrote:
On Jan 17, 3:33 am, t...@audioboundary.com (Tom Molesworth) wrote:
Looking at it again, I think the missing part is the created fields - you'll probably need both of those in the group by clause as well to get meaningful numbers. I think that makes:
select u.user_id,
count(b.user_id) as "book_count",
coalesce(sum(b.price), 0) as "book_price_total",
count(p.user_id) as "pencil_count",
coalesce(sum(p.price), 0) as "pencil_price_total"
from tst.users u
left join tst.books b on b.user_id = u.user_id and b.created = '2012-01-01'
left join tst.pencils p on p.user_id = u.user_id and p.created = '2012-01-01'
group by u.user_id, p.created, b.created
order by u.user_id;
Why?
What reason is there to include the "created" fields in the GROUP BY but not place them into the corresponding SELECT output?
The true issue is that the aggregates are operating on two independent joins. If you have 3 pencil records and two book records you end up effectively CROSS JOINing them to get 6 sale records which are then aggregated. You have to ensure that at most one record is on the right side of each join so that 1 X 1 -> 1. You can only do this by performing separate aggregations for each independent dataset.
David J.