Thread: A complex plproxy query

A complex plproxy query

From
Igor Katson
Date:
This is a complex question, and I couldn't form it in a short and easy
way, and I'm sorry for that.

First of all, let me introduce you to the DB (to form a question), for
you to understand what am I talking about. The module looks like a
social network, just the users have friends, which can be in different
groups.

Also it is clustered with PLPROXY by user_id, so the user itself, and
his friends list (the list of ID's) is always in the same DB, but the
information about the friends is not (it is clustered through all the
partitions). Here is a little sketch of a the needed tables:

CREATE TABLE friend
(
   id bigint,
   user_id integer,
   friend_id integer,
   group_id bigint,
...
);
This table is a 'friend link' from one user to another, which can be
marked as being in some 'group', and the backward link exists also (from
the 2nd user to the 1st), which can possibly be in another 'group'.

CREATE TABLE user
(
  user_id integer,
  nickname text,
.... -- lots of other info
);
This is just a user table.

Both of these are clustered by user_id. I need to form the following
query, for it to be as fast as possible (here it is written as if it the
DB was not partitioned):
SELECT something FROM user u, friend f
WHERE u.user_id = f.friend.id
AND f.user_id = $1 (this is given as an argument)
AND f.group_id = $2

So to say, give me the list of friends (not only their ID's, but all the
needed columns!) of given individual, which are in a given group. That
seems ok without plproxy, but with using it, I can't imagine how can I
form a nice query, or a function (or a set of plpgsql + plproxy
functions) to do the job.

Thanks in advance and regards,
Igor Katson.




Re: [Plproxy-users] A complex plproxy query

From
Hannu Krosing
Date:
On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:

> So to say, give me the list of friends (not only their ID's, but all the
> needed columns!) of given individual, which are in a given group. That
> seems ok without plproxy, but with using it, I can't imagine how can I
> form a nice query, or a function (or a set of plpgsql + plproxy
> functions) to do the job.

You need to do it in two steps - first run a query on the partition the
user is in to get list of friends ids, then run a second RUN ON ALL
query with

WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2

to gather all friend info in parallel

--
------------------------------------------
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training


Re: [Plproxy-users] A complex plproxy query

From
Igor Katson
Date:
Hannu Krosing wrote:
> On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:
>
>
>> So to say, give me the list of friends (not only their ID's, but all the
>> needed columns!) of given individual, which are in a given group. That
>> seems ok without plproxy, but with using it, I can't imagine how can I
>> form a nice query, or a function (or a set of plpgsql + plproxy
>> functions) to do the job.
>>
>
> You need to do it in two steps - first run a query on the partition the
> user is in to get list of friends ids, then run a second RUN ON ALL
> query with
>
> WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2
>
> to gather all friend info in parallel
>
>
I was thinking about that. But I don't understand, how can I pass the
list of id's. Should I turn the output of a select into an array? How
then? What if the array gets hundreds of items long?

Re: [Plproxy-users] A complex plproxy query

From
Marko Kreen
Date:
On 1/22/09, Igor Katson <descentspb@gmail.com> wrote:
> Hannu Krosing wrote:
>  > On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:
>  >
>  >
>  >> So to say, give me the list of friends (not only their ID's, but all the
>  >> needed columns!) of given individual, which are in a given group. That
>  >> seems ok without plproxy, but with using it, I can't imagine how can I
>  >> form a nice query, or a function (or a set of plpgsql + plproxy
>  >> functions) to do the job.
>  >>
>  >
>  > You need to do it in two steps - first run a query on the partition the
>  > user is in to get list of friends ids, then run a second RUN ON ALL
>  > query with
>  >
>  > WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2
>  >
>  > to gather all friend info in parallel
>  >
>  >
>
> I was thinking about that. But I don't understand, how can I pass the
>  list of id's. Should I turn the output of a select into an array? How
>  then? What if the array gets hundreds of items long?

Yes, array works fine.  And if it's long, then let it be long...

--
marko

Re: [Plproxy-users] A complex plproxy query

From
Igor Katson
Date:
Marko Kreen wrote:
> On 1/22/09, Igor Katson <descentspb@gmail.com> wrote:
>
>> Hannu Krosing wrote:
>>  > On Thu, 2009-01-22 at 02:33 +0300, Igor Katson wrote:
>>  >
>>  >
>>  >> So to say, give me the list of friends (not only their ID's, but all the
>>  >> needed columns!) of given individual, which are in a given group. That
>>  >> seems ok without plproxy, but with using it, I can't imagine how can I
>>  >> form a nice query, or a function (or a set of plpgsql + plproxy
>>  >> functions) to do the job.
>>  >>
>>  >
>>  > You need to do it in two steps - first run a query on the partition the
>>  > user is in to get list of friends ids, then run a second RUN ON ALL
>>  > query with
>>  >
>>  > WHERE f.friend.id in (list of ids from f1) AND f.group_id = $2
>>  >
>>  > to gather all friend info in parallel
>>  >
>>  >
>>
>> I was thinking about that. But I don't understand, how can I pass the
>>  list of id's. Should I turn the output of a select into an array? How
>>  then? What if the array gets hundreds of items long?
>>
>
> Yes, array works fine.  And if it's long, then let it be long...
>
>
Ok, thank you, guys. What is the best way to make an array out of a
column? I didn't make up anything better then writing a function:

CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
AS $$
    DECLARE
        arr int[];
        rec int;
    BEGIN
        FOR rec IN EXECUTE query
        LOOP
            arr := array_append('{}',rec);
        END LOOP;
        RETURN arr;
    END;
$$ language plpgsql;


Re: [Plproxy-users] A complex plproxy query

From
Hannu Krosing
Date:
On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:

> Ok, thank you, guys. What is the best way to make an array out of a
> column? I didn't make up anything better then writing a function:
>
> CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
> AS $$
>     DECLARE
>         arr int[];
>         rec int;
>     BEGIN
>         FOR rec IN EXECUTE query
>         LOOP
>             arr := array_append('{}',rec);
>         END LOOP;
>         RETURN arr;
>     END;
> $$ language plpgsql;

hannu=# select ARRAY(select usename from pg_user);
           ?column?
-------------------------------
 {postgres,hannu,m1,skyncuser}
(1 row)



--
------------------------------------------
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training


Re: [Plproxy-users] A complex plproxy query

From
Igor Katson
Date:
Hannu Krosing wrote:
> On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:
>
>
>> Ok, thank you, guys. What is the best way to make an array out of a
>> column? I didn't make up anything better then writing a function:
>>
>> CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
>> AS $$
>>     DECLARE
>>         arr int[];
>>         rec int;
>>     BEGIN
>>         FOR rec IN EXECUTE query
>>         LOOP
>>             arr := array_append('{}',rec);
>>         END LOOP;
>>         RETURN arr;
>>     END;
>> $$ language plpgsql;
>>
>
> hannu=# select ARRAY(select usename from pg_user);
>            ?column?
> -------------------------------
>  {postgres,hannu,m1,skyncuser}
> (1 row)
>
>
>
>
Lots of thanks! I tried the same one, but with ARRAY[], so i didn't get
anything.

Re: [Plproxy-users] A complex plproxy query

From
Hannu Krosing
Date:
On Thu, 2009-01-22 at 16:13 +0200, Hannu Krosing wrote:
> On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:
>
> > Ok, thank you, guys. What is the best way to make an array out of a
> > column? I didn't make up anything better then writing a function:
> >
> > CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
> > AS $$
> >     DECLARE
> >         arr int[];
> >         rec int;
> >     BEGIN
> >         FOR rec IN EXECUTE query
> >         LOOP
> >             arr := array_append('{}',rec);
> >         END LOOP;
> >         RETURN arr;
> >     END;
> > $$ language plpgsql;
>
> hannu=# select ARRAY(select usename from pg_user);
>            ?column?
> -------------------------------
>  {postgres,hannu,m1,skyncuser}
> (1 row)

So what yo need is

select * from
gather_friend_info_from_all_partitions(get_friend_id_list_form_user_partition(user_id));

--
------------------------------------------
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training


Re: [Plproxy-users] A complex plproxy query

From
Igor Katson
Date:
Hannu Krosing wrote:
> On Thu, 2009-01-22 at 16:13 +0200, Hannu Krosing wrote:
>
>> On Thu, 2009-01-22 at 15:10 +0300, Igor Katson wrote:
>>
>>
>>> Ok, thank you, guys. What is the best way to make an array out of a
>>> column? I didn't make up anything better then writing a function:
>>>
>>> CREATE OR REPLACE FUNCTION int_column_to_array(query text) RETURNS int[]
>>> AS $$
>>>     DECLARE
>>>         arr int[];
>>>         rec int;
>>>     BEGIN
>>>         FOR rec IN EXECUTE query
>>>         LOOP
>>>             arr := array_append('{}',rec);
>>>         END LOOP;
>>>         RETURN arr;
>>>     END;
>>> $$ language plpgsql;
>>>
>> hannu=# select ARRAY(select usename from pg_user);
>>            ?column?
>> -------------------------------
>>  {postgres,hannu,m1,skyncuser}
>> (1 row)
>>
>
> So what yo need is
>
> select * from
> gather_friend_info_from_all_partitions(get_friend_id_list_form_user_partition(user_id));
>
>
Yes, after using arrays, I figured out perfectly, how to do that. And
thanks for the help!