Collapsing multiple subqueries into one

Started by Chris Hanksover 14 years ago4 messagesgeneral
Jump to latest
#1Chris Hanks
christopher.m.hanks@gmail.com

I have two tables:

CREATE TABLE items
(
root_id integer NOT NULL,
id serial NOT NULL,
-- Other fields...

CONSTRAINT items_pkey PRIMARY KEY (root_id, id)
)

CREATE TABLE votes
(
root_id integer NOT NULL,
item_id integer NOT NULL,
user_id integer NOT NULL,
type smallint NOT NULL,
direction smallint,

CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type),
CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id)
REFERENCES items (root_id, id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
-- Other constraints...
)

I'm trying to, in a single query, pull out all items of a particular
root_id along with a few arrays of user_ids of the users who voted in
particular ways. The following query does what I need:

SELECT *,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = 1) as upvoters,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = -1) as downvoters,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 1) as favoriters
FROM items i
WHERE root_id = 1
ORDER BY id

The problem is that I'm using three subqueries to get the information
I need when it seems like I should be able to do the same in one. I
thought that Postgres (I'm using 8.4) might be smart enough to
collapse them all into a single query for me, but looking at the
explain output in pgAdmin it looks like that's not happening - it's
running multiple primary key lookups on the votes table instead. I
feel like I could rework this query to be more efficient, but I'm not
sure how.

Any pointers?

#2Royce Ausburn
royce.ml@inomial.com
In reply to: Chris Hanks (#1)
Re: Collapsing multiple subqueries into one

This might help you:

http://www.postgresql.org/docs/8.4/static/queries-with.html

On 24/08/2011, at 9:54 AM, Chris Hanks wrote:

Show quoted text

I have two tables:

CREATE TABLE items
(
root_id integer NOT NULL,
id serial NOT NULL,
-- Other fields...

CONSTRAINT items_pkey PRIMARY KEY (root_id, id)
)

CREATE TABLE votes
(
root_id integer NOT NULL,
item_id integer NOT NULL,
user_id integer NOT NULL,
type smallint NOT NULL,
direction smallint,

CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type),
CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id)
REFERENCES items (root_id, id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
-- Other constraints...
)

I'm trying to, in a single query, pull out all items of a particular
root_id along with a few arrays of user_ids of the users who voted in
particular ways. The following query does what I need:

SELECT *,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = 1) as upvoters,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = -1) as downvoters,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 1) as favoriters
FROM items i
WHERE root_id = 1
ORDER BY id

The problem is that I'm using three subqueries to get the information
I need when it seems like I should be able to do the same in one. I
thought that Postgres (I'm using 8.4) might be smart enough to
collapse them all into a single query for me, but looking at the
explain output in pgAdmin it looks like that's not happening - it's
running multiple primary key lookups on the votes table instead. I
feel like I could rework this query to be more efficient, but I'm not
sure how.

Any pointers?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Chris Hanks
christopher.m.hanks@gmail.com
In reply to: Royce Ausburn (#2)
Re: Collapsing multiple subqueries into one

Thanks Royce. I put together another query using a WITH statement
that's also working:

WITH v AS (
SELECT item_id, type, direction, array_agg(user_id) as user_ids
FROM votes
WHERE root_id = 5305
GROUP BY type, direction, item_id
ORDER BY type, direction, item_id
)
SELECT *,
(SELECT user_ids from v where item_id = i.id AND type = 0 AND
direction = 1) as upvoters,
(SELECT user_ids from v where item_id = i.id AND type = 0 AND
direction = -1) as downvoters,
(SELECT user_ids from v where item_id = i.id AND type = 1) as favoriters
FROM items i
WHERE root_id = 5305
ORDER BY id

It feels more sensible to me, but it's slightly slower than my initial
attempt (15 ms vs. 13 ms, when running as a prepared statement to
avoid any query parsing overhead, and averaging the time over several
thousand queries). I'm not sure why...?

Show quoted text

On Tue, Aug 23, 2011 at 8:14 PM, Royce Ausburn <royce.ml@inomial.com> wrote:

This might help you:

http://www.postgresql.org/docs/8.4/static/queries-with.html

On 24/08/2011, at 9:54 AM, Chris Hanks wrote:

I have two tables:

CREATE TABLE items
(
 root_id integer NOT NULL,
 id serial NOT NULL,
 -- Other fields...

 CONSTRAINT items_pkey PRIMARY KEY (root_id, id)
)

CREATE TABLE votes
(
 root_id integer NOT NULL,
 item_id integer NOT NULL,
 user_id integer NOT NULL,
 type smallint NOT NULL,
 direction smallint,

 CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type),
 CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id)
     REFERENCES items (root_id, id) MATCH SIMPLE
     ON UPDATE CASCADE ON DELETE CASCADE,
 -- Other constraints...
)

I'm trying to, in a single query, pull out all items of a particular
root_id along with a few arrays of user_ids of the users who voted in
particular ways. The following query does what I need:

SELECT *,
 ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = 1) as upvoters,
 ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = -1) as downvoters,
 ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 1) as favoriters
FROM items i
WHERE root_id = 1
ORDER BY id

The problem is that I'm using three subqueries to get the information
I need when it seems like I should be able to do the same in one. I
thought that Postgres (I'm using 8.4) might be smart enough to
collapse them all into a single query for me, but looking at the
explain output in pgAdmin it looks like that's not happening - it's
running multiple primary key lookups on the votes table instead. I
feel like I could rework this query to be more efficient, but I'm not
sure how.

Any pointers?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Royce Ausburn
royce.ml@inomial.com
In reply to: Chris Hanks (#3)
Re: Collapsing multiple subqueries into one

On 24/08/2011, at 4:44 PM, Chris Hanks wrote:

Thanks Royce. I put together another query using a WITH statement
that's also working:

WITH v AS (
SELECT item_id, type, direction, array_agg(user_id) as user_ids
FROM votes
WHERE root_id = 5305
GROUP BY type, direction, item_id
ORDER BY type, direction, item_id
)
SELECT *,
(SELECT user_ids from v where item_id = i.id AND type = 0 AND
direction = 1) as upvoters,
(SELECT user_ids from v where item_id = i.id AND type = 0 AND
direction = -1) as downvoters,
(SELECT user_ids from v where item_id = i.id AND type = 1) as favoriters
FROM items i
WHERE root_id = 5305
ORDER BY id

It feels more sensible to me, but it's slightly slower than my initial
attempt (15 ms vs. 13 ms, when running as a prepared statement to
avoid any query parsing overhead, and averaging the time over several
thousand queries). I'm not sure why...?

I'm not sure, Chris - perhaps others on the mailing list can answer this?

Show quoted text

On Tue, Aug 23, 2011 at 8:14 PM, Royce Ausburn <royce.ml@inomial.com> wrote:

This might help you:

http://www.postgresql.org/docs/8.4/static/queries-with.html

On 24/08/2011, at 9:54 AM, Chris Hanks wrote:

I have two tables:

CREATE TABLE items
(
root_id integer NOT NULL,
id serial NOT NULL,
-- Other fields...

CONSTRAINT items_pkey PRIMARY KEY (root_id, id)
)

CREATE TABLE votes
(
root_id integer NOT NULL,
item_id integer NOT NULL,
user_id integer NOT NULL,
type smallint NOT NULL,
direction smallint,

CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type),
CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id)
REFERENCES items (root_id, id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
-- Other constraints...
)

I'm trying to, in a single query, pull out all items of a particular
root_id along with a few arrays of user_ids of the users who voted in
particular ways. The following query does what I need:

SELECT *,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = 1) as upvoters,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 0 AND direction = -1) as downvoters,
ARRAY(SELECT user_id from votes where root_id = i.root_id AND
item_id = i.id AND type = 1) as favoriters
FROM items i
WHERE root_id = 1
ORDER BY id

The problem is that I'm using three subqueries to get the information
I need when it seems like I should be able to do the same in one. I
thought that Postgres (I'm using 8.4) might be smart enough to
collapse them all into a single query for me, but looking at the
explain output in pgAdmin it looks like that's not happening - it's
running multiple primary key lookups on the votes table instead. I
feel like I could rework this query to be more efficient, but I'm not
sure how.

Any pointers?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general