Re: Adding LEFT JOIN to a query has increased execution time 10 times - Mailing list pgsql-general

From Alexander Farber
Subject Re: Adding LEFT JOIN to a query has increased execution time 10 times
Date
Msg-id CAADeyWh5jLcRViB2rqYOiehVA1e0k0FD3oL19oGqLo6Z2TSuKg@mail.gmail.com
Whole thread Raw
In response to Re: Adding LEFT JOIN to a query has increased execution time 10 times  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: Adding LEFT JOIN to a query has increased execution time 10 times  (Alexander Farber <alexander.farber@gmail.com>)
List pgsql-general
Oh ok, so it is not as simple as eliminating all "Seq Scan" occurrences...

Thank you for replying Andrew -

On Sat, Jan 5, 2019 at 9:18 PM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

That seems slow in itself, even before adding the extra join - the
explain suggests that you're both short on indexes and you're getting
pretty bad plans, possibly due to exceeding join_collapse_limit.

(You might try increasing that in your config, along with
from_collapse_limit; the default values are a legacy of the days when
CPUs were much slower and planning time more of an issue.)

 AF> but after I have added a LEFT JOIN with the following table, the
 AF> query takes 7-10 seconds for completion and makes the game
 AF> unpleasant to play:

 AF>     # \d words_geoip;
 AF>                      Table "public.words_geoip"
 AF>      Column |       Type       | Collation | Nullable | Default
 AF>     --------+------------------+-----------+----------+---------
 AF>      block  | inet             |           | not null |
 AF>      lat    | double precision |           |          |
 AF>      lng    | double precision |           |          |
 AF>     Indexes:
 AF>         "words_geoip_pkey" PRIMARY KEY, btree (block)

And here's yet another missing index, resulting in your query having to
process and discard 27 million rows in the course of generating a result
of only 9 rows:

  Join Filter: (CASE WHEN (g.player1 = 5) THEN u2.ip ELSE u1.ip END << i2.block)
  Rows Removed by Join Filter: 27660682

(you probably wanted <<= rather than << as that comparison, if there's
any chance your geoip table might have entries for single IPs)

Fortunately, this being pg10, you can use either of these indexes:

CREATE INDEX ON words_geoip USING gist (block inet_ops);

or

CREATE INDEX ON words_geoip USING spgist (block);

As for the rest of the query, here are places you could probably
work on:

 AF>     LEFT JOIN words_moves m ON m.gid = g.gid
 AF>             AND NOT EXISTS (SELECT 1
 AF>                     FROM words_moves m2
 AF>                     WHERE m2.gid = m.gid
 AF>                     AND m2.played > m.played)

Whar you're asking for here is that the words_moves row that you're
joining not have a matching row with a larger "played" value. You can do
this far more efficiently with a lateral join, given the right index.

 AF>     LEFT JOIN words_social s1 ON s1.uid = 5
 AF>             AND NOT EXISTS (SELECT 1
 AF>                     FROM words_social s
 AF>                     WHERE s1.uid = s.uid
 AF>                     AND s.stamp > s1.stamp)
 AF>     LEFT JOIN words_social s2 ON s2.uid = (CASE WHEN g.player1 = 5 THEN
 AF> g.player2 ELSE g.player1 END)
 AF>             AND NOT EXISTS (SELECT 1
 AF>                     FROM words_social s
 AF>                     WHERE s2.uid = s.uid
 AF>                     AND s.stamp > s2.stamp)

Similar considerations apply to both of the above.

 AF>     WHERE 5 IN (g.player1, g.player2)
 AF>     AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1
 AF> day');

This WHERE clause could be written as

  WHERE 5 IN (g.player1, g.player2)
    AND coalesce(g.finished,'infinity') > (current_timestamp - interval '1 day')

and you could then create the following indexes,

CREATE INDEX ON words_games (player1, coalesce(finished,'infinity'));
CREATE INDEX ON words_games (player2, coalesce(finished,'infinity'));

which should get you a BitmapOr plan for that condition.

 AF> I have also asked my question at [dba.stack]

If you ask questions like this on the IRC channel (#postgresql on
chat.freenode.net - see http://freenode.net for info or web-based client
access), you can usually get feedback in real time (I rarely answer
performance questions in email because getting responses just takes too
long). You may have to be patient.


I will try to digest your information and to follow up... Thanks again

For IRC I am unfortunately too tired right now (evening in Germany) 

Regards
Alex

pgsql-general by date:

Previous
From: Alexander Farber
Date:
Subject: Re: Adding LEFT JOIN to a query has increased execution time 10 times
Next
From: Mitar
Date:
Subject: Re: Dropping and creating a trigger