A complex plproxy query

Started by Igor Katsonabout 17 years ago9 messagesgeneral
Jump to latest
#1Igor Katson
descentspb@gmail.com

This is a complex question, and I couldn't form it in a short and easy
way, and I'm sorry for that.

First of all, let me introduce you to the DB (to form a question), for
you to understand what am I talking about. The module looks like a
social network, just the users have friends, which can be in different
groups.

Also it is clustered with PLPROXY by user_id, so the user itself, and
his friends list (the list of ID's) is always in the same DB, but the
information about the friends is not (it is clustered through all the
partitions). Here is a little sketch of a the needed tables:

CREATE TABLE friend
(
id bigint,
user_id integer,
friend_id integer,
group_id bigint,
...
);
This table is a 'friend link' from one user to another, which can be
marked as being in some 'group', and the backward link exists also (from
the 2nd user to the 1st), which can possibly be in another 'group'.

CREATE TABLE user
(
user_id integer,
nickname text,
.... -- lots of other info
);
This is just a user table.

Both of these are clustered by user_id. I need to form the following
query, for it to be as fast as possible (here it is written as if it the
DB was not partitioned):
SELECT something FROM user u, friend f
WHERE u.user_id = f.friend.id
AND f.user_id = $1 (this is given as an argument)
AND f.group_id = $2

So to say, give me the list of friends (not only their ID's, but all the
needed columns!) of given individual, which are in a given group. That
seems ok without plproxy, but with using it, I can't imagine how can I
form a nice query, or a function (or a set of plpgsql + plproxy
functions) to do the job.

Thanks in advance and regards,
Igor Katson.

#2Hannu Krosing
hannu@tm.ee
In reply to: Igor Katson (#1)
Re: [Plproxy-users] A complex plproxy query

On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:

So to say, give me the list of friends (not only their ID's, but all the
needed columns!) of given individual, which are in a given group. That
seems ok without plproxy, but with using it, I can't imagine how can I
form a nice query, or a function (or a set of plpgsql + plproxy
functions) to do the job.

You need to do it in two steps - first run a query on the partition the
user is in to get list of friends ids, then run a second RUN ON ALL
query with

WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2

to gather all friend info in parallel

--
------------------------------------------
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training

#3Igor Katson
descentspb@gmail.com
In reply to: Hannu Krosing (#2)
Re: [Plproxy-users] A complex plproxy query

Hannu Krosing wrote:

On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:

So to say, give me the list of friends (not only their ID's, but all the
needed columns!) of given individual, which are in a given group. That
seems ok without plproxy, but with using it, I can't imagine how can I
form a nice query, or a function (or a set of plpgsql + plproxy
functions) to do the job.

You need to do it in two steps - first run a query on the partition the
user is in to get list of friends ids, then run a second RUN ON ALL
query with

WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2

to gather all friend info in parallel

I was thinking about that. But I don't understand, how can I pass the
list of id's. Should I turn the output of a select into an array? How
then? What if the array gets hundreds of items long?

#4Marko Kreen
markokr@gmail.com
In reply to: Igor Katson (#3)
Re: [Plproxy-users] A complex plproxy query

On 1/22/09, Igor Katson <descentspb@gmail.com> wrote:

Hannu Krosing wrote:

On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:

So to say, give me the list of friends (not only their ID's, but all the
needed columns!) of given individual, which are in a given group. That
seems ok without plproxy, but with using it, I can't imagine how can I
form a nice query, or a function (or a set of plpgsql + plproxy
functions) to do the job.

You need to do it in two steps - first run a query on the partition the
user is in to get list of friends ids, then run a second RUN ON ALL
query with

WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2

to gather all friend info in parallel

I was thinking about that. But I don't understand, how can I pass the
list of id's. Should I turn the output of a select into an array? How
then? What if the array gets hundreds of items long?

Yes, array works fine. And if it's long, then let it be long...

--
marko

#5Igor Katson
descentspb@gmail.com
In reply to: Marko Kreen (#4)
Re: [Plproxy-users] A complex plproxy query

Marko Kreen wrote:

On 1/22/09, Igor Katson <descentspb@gmail.com> wrote:

Hannu Krosing wrote:

On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:

So to say, give me the list of friends (not only their ID's, but all the
needed columns!) of given individual, which are in a given group. That
seems ok without plproxy, but with using it, I can't imagine how can I
form a nice query, or a function (or a set of plpgsql + plproxy
functions) to do the job.

You need to do it in two steps - first run a query on the partition the
user is in to get list of friends ids, then run a second RUN ON ALL
query with

WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2

to gather all friend info in parallel

I was thinking about that. But I don't understand, how can I pass the
list of id's. Should I turn the output of a select into an array? How
then? What if the array gets hundreds of items long?

Yes, array works fine. And if it's long, then let it be long...

Ok, thank you, guys. What is the best way to make an array out of a
column? I didn't make up anything better then writing a function:

CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
AS $$
DECLARE
arr int[];
rec int;
BEGIN
FOR rec IN EXECUTE query
LOOP
arr := array_append('{}',rec);
END LOOP;
RETURN arr;
END;
$$ language plpgsql;

#6Hannu Krosing
hannu@tm.ee
In reply to: Igor Katson (#5)
Re: [Plproxy-users] A complex plproxy query

On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:

Ok, thank you, guys. What is the best way to make an array out of a
column? I didn't make up anything better then writing a function:

CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
AS $$
DECLARE
arr int[];
rec int;
BEGIN
FOR rec IN EXECUTE query
LOOP
arr := array_append('{}',rec);
END LOOP;
RETURN arr;
END;
$$ language plpgsql;

hannu=# select ARRAY(select usename from pg_user);
?column?
-------------------------------
{postgres,hannu,m1,skyncuser}
(1 row)

--
------------------------------------------
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training

#7Igor Katson
descentspb@gmail.com
In reply to: Hannu Krosing (#6)
Re: [Plproxy-users] A complex plproxy query

Hannu Krosing wrote:

On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:

Ok, thank you, guys. What is the best way to make an array out of a
column? I didn't make up anything better then writing a function:

CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
AS $$
DECLARE
arr int[];
rec int;
BEGIN
FOR rec IN EXECUTE query
LOOP
arr := array_append('{}',rec);
END LOOP;
RETURN arr;
END;
$$ language plpgsql;

hannu=# select ARRAY(select usename from pg_user);
?column?
-------------------------------
{postgres,hannu,m1,skyncuser}
(1 row)

Lots of thanks! I tried the same one, but with ARRAY[], so i didn't get
anything.

#8Hannu Krosing
hannu@tm.ee
In reply to: Hannu Krosing (#6)
Re: [Plproxy-users] A complex plproxy query

On Thu, 2009-01-22 at 16:13 +0200, Hannu Krosing wrote:

On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:

Ok, thank you, guys. What is the best way to make an array out of a
column? I didn't make up anything better then writing a function:

CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
AS $$
DECLARE
arr int[];
rec int;
BEGIN
FOR rec IN EXECUTE query
LOOP
arr := array_append('{}',rec);
END LOOP;
RETURN arr;
END;
$$ language plpgsql;

hannu=# select ARRAY(select usename from pg_user);
?column?
-------------------------------
{postgres,hannu,m1,skyncuser}
(1 row)

So what yo need is

select * from
gather_friend_info_from_all_partitions(get_friend_id_list_form_user_partition(user_id));

--
------------------------------------------
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training

#9Igor Katson
descentspb@gmail.com
In reply to: Hannu Krosing (#8)
Re: [Plproxy-users] A complex plproxy query

Hannu Krosing wrote:

On Thu, 2009-01-22 at 16:13 +0200, Hannu Krosing wrote:

On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:

Ok, thank you, guys. What is the best way to make an array out of a
column? I didn't make up anything better then writing a function:

CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
AS $$
DECLARE
arr int[];
rec int;
BEGIN
FOR rec IN EXECUTE query
LOOP
arr := array_append('{}',rec);
END LOOP;
RETURN arr;
END;
$$ language plpgsql;

hannu=# select ARRAY(select usename from pg_user);
?column?
-------------------------------
{postgres,hannu,m1,skyncuser}
(1 row)

So what yo need is

select * from
gather_friend_info_from_all_partitions(get_friend_id_list_form_user_partition(user_id));

Yes, after using arrays, I figured out perfectly, how to do that. And
thanks for the help!