Re: newsfeed type query - Mailing list pgsql-general

From Ladislav Lenart
Subject Re: newsfeed type query
Date
Msg-id 5540B873.8010102@volny.cz
Whole thread Raw
In response to Re: newsfeed type query  (Jonathan Vanasco <postgres@2xlp.com>)
Responses Re: newsfeed type query
Re: newsfeed type query
List pgsql-general
Hello.

On 29.4.2015 01:57, Jonathan Vanasco wrote:
> Sorry, I was trying to ask something very abstract as I have similar situations
> on multiple groups of queries/tables (and they're all much more complex).
>
> I'm on pg 9.3
>
> The relevant structure is:
>
>     posting:
>         id
>         timestamp_publish
>         group_id__in
>         user_id__author
>
>     friends:
>         user_id__a
>         user_id__b
>
>     memberships:
>         user_id
>         group_id
>         role_id
>
>
> -- working sql
>                 CREATE TABLE groups(
>                     id SERIAL NOT NULL PRIMARY KEY
>                 );
>                 CREATE TABLE users(
>                     id SERIAL NOT NULL PRIMARY KEY
>                 );
>                 CREATE TABLE friends (
>                     user_id__a INT NOT NULL REFERENCES users( id ),
>                     user_id__b INT NOT NULL REFERENCES users( id )
>                 );
>                 CREATE TABLE memberships (
>                     user_id INT NOT NULL REFERENCES users( id ),
>                     group_id INT NOT NULL REFERENCES groups( id ),
>                     role_id INT NOT NULL
>                 );
>                 CREATE TABLE posting (
>                     id SERIAL NOT NULL,
>                     timestamp_publish timestamp not null,
>                     group_id__in INT NOT NULL REFERENCES groups(id),
>                     user_id__author INT NOT NULL REFERENCES users(id),
>                     is_published BOOL
>                 );
>
> The output that I'm trying to get is:
>     posting.id
>     {the context of the select}
>     posting.timestamp_publish (this may need to get correlated into other queries)
>
>
> These approaches had bad performance:
>
>     -- huge selects / memory
>     -- it needs to load everything from 2 tables before it limits
>     EXPLAIN ANALYZE
>     SELECT id, feed_context FROM (
>         SELECT  id, timestamp_publish, 'in-group' AS feed_context FROM posting
>         WHERE (
>             group_id__in IN (SELECT group_id FROM memberships WHERE user_id = 57
> AND role_id IN (1,2,3))
>             AND (is_published = True AND timestamp_publish <= CURRENT_TIMESTAMP
> AT TIME ZONE 'UTC')
>             )
>         UNION
>         SELECT  id, timestamp_publish, 'by-user' AS feed_context FROM posting
>         WHERE (
>             user_id__author IN (SELECT user_id__b FROM friends WHERE user_id__a
> = 57)
>             AND (is_published = True AND timestamp_publish <= CURRENT_TIMESTAMP
> AT TIME ZONE 'UTC')
>             )
>     ) AS feed
>     ORDER BY  timestamp_publish DESC
>     LIMIT 10
>     ;


I think you can propagate ORDER BY and LIMIT also to the subqueries of the
UNION, i.e.:

select...
from (
    (
        select...
        from posting
        where... -- friends
        order by timestamp_publish desc
        limit 10
    ) union (
    (
        select...
        from posting
        where... -- groups
        order by timestamp_publish desc
        limit 10
    )
) as feed
order by timestamp_publish desc
limit 10

That might behave better.


Ladislav Lenart


>     -- selects minimized, but repetitive subqueries
>     SELECT
>         id,
>         CASE
>             WHEN group_id__in IN (SELECT group_id FROM memberships WHERE user_id
> = 57 AND role_id IN (1,2,3)) THEN True
>             ELSE NULL
>         END AS feed_context_group,
>         CASE
>             WHEN user_id__author IN (SELECT user_id__b FROM friends WHERE
> user_id__a = 57) THEN True
>             ELSE NULL
>         END AS feed_context_user
>         FROM posting
>     WHERE (
>             group_id__in IN (SELECT group_id FROM memberships WHERE user_id = 57
> AND role_id IN (1,2,3))
>             OR
>             user_id__author IN (SELECT user_id__b FROM friends WHERE user_id__a
> = 57)
>         )
>         AND (is_published = True AND timestamp_publish <= CURRENT_TIMESTAMP AT
> TIME ZONE 'UTC')
>     ORDER BY  timestamp_publish DESC
>     LIMIT 10
>     ;
>
>
>
> On Apr 28, 2015, at 6:56 PM, Melvin Davidson wrote:
>
>> Since you very nicely DID NOT provide the pg version, O/S or table
>> structure(s), which is what you should do REGARDLESS of the
>> type of question (it's just the smart and polite thing to do when asking for
>> help) The best  I can suggest is:
>> SELECT
>>   CASE WHEN context = 'friend' THEN p.junka
>>             WHEN context = 'group' THEN p.junkb
>>             WHEN context = 'both'   THEN p.junka || ' ' || p.junkb
>>     END
>>    FROM posting p
>>   where p.author_id in (SELECT f.friend_id
>>                                     FROM friends f
>>                                   WHERE f.user_id = ?)
>>        OR p.group_id in (SELECT m.group_id
>>                                     FROM memberships m
>>                                   WHERE m.user_id = ?);
>
>




pgsql-general by date:

Previous
From: Ladislav Lenart
Date:
Subject: Re: newsfeed type query
Next
From: Magnus Hagander
Date:
Subject: Re: Upgrading hot standbys