Re: Same condition in the CTE and in the subsequent JOIN using it - Mailing list pgsql-general

From Alban Hertroys
Subject Re: Same condition in the CTE and in the subsequent JOIN using it
Date
Msg-id 33EFAA18-E3DB-4C8F-850C-205A6325E4EE@gmail.com
Whole thread Raw
In response to Same condition in the CTE and in the subsequent JOIN using it  (Alexander Farber <alexander.farber@gmail.com>)
Responses Re: Same condition in the CTE and in the subsequent JOIN using it  (Alexander Farber <alexander.farber@gmail.com>)
List pgsql-general
> On 25 Apr 2018, at 17:45, Alexander Farber <alexander.farber@gmail.com> wrote:

(…)

> And here is the function source code:
>
> CREATE OR REPLACE FUNCTION words_stat_scores(
>                 in_social integer,
>                 in_sid    text
>         ) RETURNS TABLE (
>                 out_day   text,
>                 out_diff  numeric,
>                 out_score numeric
>         ) AS
> $func$
>         WITH cte AS (
>                 SELECT
>                         DATE_TRUNC('day', m.played) AS day,
>                         m.mid,
>                         EXTRACT(EPOCH FROM m.played - LAG(m.played) OVER (PARTITION BY m.gid ORDER BY
m.played))::int/60AS diff 
>                 FROM    words_moves m
>                 JOIN    words_games g ON (m.gid = g.gid)
>                 JOIN    words_social s ON (s.uid IN (g.player1, g.player2))
>                 WHERE   s.social = in_social                           -- CAN THIS BE REFERRED TO FROM BELOW?
>                 AND     s.sid = in_sid
>                 AND     m.played > CURRENT_TIMESTAMP - interval '1 month'
>         )
>         SELECT
>                 TO_CHAR(c.day, 'DD.MM.YYYY'),
>                 ROUND(AVG(c.diff)),
>                 ROUND(AVG(m.score), 1)
>         FROM    words_moves m
>         JOIN    cte c using(mid)
>         JOIN    words_social s USING(uid)
>         WHERE   s.social = in_social
>         AND     s.sid = in_sid
>         AND     m.action = 'play'
>         GROUP BY c.day
>         ORDER BY c.day;
>
> $func$ LANGUAGE sql STABLE;
>
> By looking at the above source code, do you think, that the condition being used twice (the s.social = in_social AND
s.sid= in_sid) is "too much" and can be optimized? :-) 

Actually, no. The conditions are part of different joins.

Within the CTE, you have a join that boils down to:

>   FROM    words_games g ON (m.gid = g.gid)
>   JOIN    words_social s ON (s.uid IN (g.player1, g.player2) AND s.social = in_social AND s.sid = in_sid)


In your outer query, you have:

>   FROM    words_moves m
>   JOIN    words_social s ON (s.uid = m.uid AND s.social = in_social AND s.sid = in_sid)


The joins are on different fields, in different tables even, so you can't just leave the conditions out because they
filterdifferent rows. 

What you _can_ do is move the words_social JOIN and it's conditions into a new CTE and join with that instead.
Somethinglike so: 

WITH words_in_social AS (
    SELECT sid, uid
      FROM words_social
     WHERE social = in_social
       AND sid = in_sid
),
cte AS (
    SELECT
        DATE_TRUNC('day', m.played) AS day,
        m.mid,
        EXTRACT(EPOCH FROM m.played - LAG(m.played) OVER (PARTITION BY m.gid ORDER BY m.played))::int/60 AS diff
      FROM    words_moves m
      JOIN    words_games g ON (m.gid = g.gid)
      JOIN    words_in_social s ON (s.uid IN (g.player1, g.player2))
      WHERE   m.played > CURRENT_TIMESTAMP - interval '1
)
SELECT
    TO_CHAR(c.day, 'DD.MM.YYYY'),
    ROUND(AVG(c.diff)),
    ROUND(AVG(m.score), 1)
  FROM words_moves m
  JOIN cte c using(mid)
  JOIN words_in_social s USING(uid)
 WHERE m.action = 'play'
 GROUP BY c.day
 ORDER BY c.day;

That may look cleaner, but your original query probably performs better, since CTE's also act as an optimisation fence.

BTW, I suggest to use a better name for your CTE than cte; I'd rather use a name that clarifies its purpose.

> Thank you for any hints, I apologize if my question is too specific and difficult to answer...
>
> Regards
> Alex

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



pgsql-general by date:

Previous
From: Vincent Veyron
Date:
Subject: Re: Same condition in the CTE and in the subsequent JOIN using it
Next
From: "Ahmed, Nawaz"
Date:
Subject: RE: Postgresql database encryption