Re: Performance issues - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: Performance issues
Msg-id 55083236.8070900@2ndquadrant.com
In response to Re: Performance issues  (Vivekanand Joshi <vjoshi@zetainteractive.com>)
Responses Re: Performance issues
List pgsql-performance
Just as I feared, the attached explain analyze results show significant
misestimates, like this for example:

Nested Loop  (cost=32782.19..34504.16 rows=1 width=16)
        (actual time=337.484..884.438 rows=46454 loops=1)

Nested Loop  (cost=18484.94..20366.29 rows=1 width=776)
      (actual time=2445.487..3741.049 rows=45360 loops=1)

Hash Left Join  (cost=34679.90..37396.37 rows=11644 width=148)
          (actual time=609.472..9070.675 rows=4559289 loops=1)

There are plenty of nested loop joins - the optimizer believes there will
be only a few rows in the outer relation, but gets orders of magnitude
more tuples. And nested loops are terrible in that case.

In case of the first view, it seems to be caused by this:

Merge Cond:
((s_f_touchpoint_execution_status_history.touchpoint_execution_id =
s_f_touchpoint_execution_status_history_1.touchpoint_execution_id) AND
(s_f_touchpoint_execution_status_history.creation_dt =
(max(s_f_touchpoint_execution_status_history_1.creation_dt))))

especially the ':id = max(:id)' condition is probably giving the
optimizer a hard time. This is a conceptually difficult problem (i.e.
fixing this at the optimizer level is unlikely to happen any time soon,
because it effectively means you have to predict the statistical
properties of the aggregation).

You may try increasing the statistics target, which makes the stats a
bit more detailed (the default on 9.4 is 100):

    SET default_statistics_target = 10000;
    ANALYZE;
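
If the global change feels too heavy-handed, the target can also be
raised for just the join columns that appear in the plans above (a
sketch - only these two columns are a guess at what matters):

    ALTER TABLE s_f_touchpoint_execution_status_history
      ALTER COLUMN touchpoint_execution_id SET STATISTICS 10000;
    ALTER TABLE s_f_touchpoint_execution_status_history
      ALTER COLUMN creation_dt SET STATISTICS 10000;
    ANALYZE s_f_touchpoint_execution_status_history;

That keeps planning overhead down for all the other tables.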

But I don't really believe this will fix the problem.

But maybe it's possible to rewrite the query somehow?

Let's experiment a bit - remove the aggregation, i.e. join directly to
s_f_touchpoint_execution_status_history. It'll return wrong results, but
the estimates should be better, so let's see what happens.
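
Assuming the view joins through a max() subquery (only the table and
column names come from the merge condition above, the rest is guessed),
the experiment means replacing something like

    ... JOIN (SELECT touchpoint_execution_id,
                     max(creation_dt) AS creation_dt
                FROM s_f_touchpoint_execution_status_history
               GROUP BY 1) h
          ON ...

with a plain join

    ... JOIN s_f_touchpoint_execution_status_history h
          ON ...

and then comparing the row estimates in the EXPLAIN ANALYZE output.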

You may also try disabling nested loops - the other join algorithms
usually perform better with large row counts.

    SET enable_nestloop = false;

This is not a production-suitable solution, but for experimenting that's OK.

ISTM what the aggregation (or the whole mergejoin) does is selecting the
last s_f_touchpoint_execution_status_history record for each
touchpoint_execution_id.

There are better ways to determine that, IMHO. For example:

 (1) adding a 'is_last' flag to s_f_touchpoint_execution_status_history

     This however requires maintaining that flag somehow, but the join
     would not be needed at all.

     The "last IDs" might be maintained in a separate table - the join
     would be still necessary, but it might be less intrusive and
     cheper to maintain.
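
     For the flag, a trigger might keep it up to date, e.g. like this
     (a sketch - the column, function and trigger names are made up):

         ALTER TABLE s_f_touchpoint_execution_status_history
           ADD COLUMN is_last boolean NOT NULL DEFAULT true;

         CREATE OR REPLACE FUNCTION clear_previous_last() RETURNS trigger AS $$
         BEGIN
           -- the newly inserted row becomes the last one,
           -- so unset the flag on all older rows for the same ID
           UPDATE s_f_touchpoint_execution_status_history
              SET is_last = false
            WHERE touchpoint_execution_id = NEW.touchpoint_execution_id
              AND is_last
              AND creation_dt < NEW.creation_dt;
           RETURN NEW;
         END;
         $$ LANGUAGE plpgsql;

         CREATE TRIGGER maintain_is_last
           AFTER INSERT ON s_f_touchpoint_execution_status_history
           FOR EACH ROW EXECUTE PROCEDURE clear_previous_last();

     Of course, that adds an extra UPDATE to every INSERT, so it only
     pays off if the table is queried much more often than modified.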

 (2) using window functions, e.g. like this:

     SELECT * FROM (
       SELECT *,
            ROW_NUMBER() OVER (PARTITION BY touchpoint_execution_id
                               ORDER BY creation_dt DESC) AS rn
       FROM s_f_touchpoint_execution_status_history
     ) foo WHERE rn = 1

     But estimating this is also rather difficult ...

 (3) Using temporary table / MV - this really depends on your
     requirements, load schedule, how you run the queries etc. It would
     however fix the estimation errors (probably).
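
     For the MV approach, DISTINCT ON makes the "last record per
     touchpoint_execution_id" quite simple (a sketch - the MV name is
     made up):

         CREATE MATERIALIZED VIEW mv_last_touchpoint_status AS
         SELECT DISTINCT ON (touchpoint_execution_id) *
           FROM s_f_touchpoint_execution_status_history
          ORDER BY touchpoint_execution_id, creation_dt DESC;

         -- refresh as part of the load schedule
         REFRESH MATERIALIZED VIEW mv_last_touchpoint_status;

     Joining the views to this MV instead of the history table should
     give the planner much simpler statistics to work with.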

The 2nd view seems to suffer because of the same issue (underestimates
leading to choice of nested loops), but caused by something else:

->  Hash Join  (cost=1954.13..6249.67 rows=13 width=108)
         (actual time=31.777..210.346 rows=72670 loops=1)
      Hash Cond: ((tp_exec.touchpoint_id = tp.touchpoint_id)
              AND (wave_exec.wave_id = tp.wave_id))

Estimating the cardinality of joins with multi-column conditions is
difficult; I have no idea how to fix that at the moment.

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

