Thread: 9.5 regression with unwanted nested loop left join

From
Greg Sabino Mullane
Date:
We are trying to upgrade a client app to Postgres 9.5, but are running
into some performance regression issues (even though the current db is 8.x!).
One in particular that is puzzling me involves a query that keeps slipping
into a nested loop left join, rather than a much preferred hash join.
The nested loop is many orders of magnitude slower than the hash. My
testing showed that the 8.x series handles the query fine, but as of 9.0
and up it slips into the nested loop when certain criteria are met,
specifically when the WHERE clause covers fewer than a certain number of
days. The threshold varies, but in general anything under 130 days goes
to the slow nested loop.

I made a test case that illustrates the problem. Yes, it could be written
better to alleviate the issue, but I'd rather find some way of tuning the
db (yes I said the T word) rather than rewriting queries one by one.

Setup:

DROP TABLE IF EXISTS gregtest, gregtest_status;
CREATE TABLE gregtest(id INT, order_date TIMESTAMPTZ, status TEXT);
INSERT INTO gregtest SELECT (123*RANDOM())::INT, NOW() - '1 DAY'::INTERVAL * 1234*RANDOM(),
  (10*RANDOM())::INT FROM GENERATE_SERIES(1,500000);
CREATE TABLE gregtest_status(id INT, update_time TIMESTAMPTZ, status TEXT);
INSERT INTO gregtest_status SELECT (123*RANDOM())::INT, NOW() - '1 DAY'::INTERVAL * 1234*RANDOM(),
  (10*RANDOM())::INT FROM GENERATE_SERIES(1,1000000);
ANALYZE;

Fast and slow examples (plans abbreviated: C=cost, R=rows, W=width, AT=actual time, L=loops):

EXPLAIN ANALYZE SELECT t.* FROM gregtest t
LEFT JOIN (
           SELECT gs.id, gs.status
           FROM gregtest_status gs
           JOIN (
                 SELECT id, min(update_time) AS update_time
                 FROM gregtest_status
                 WHERE update_time >= '2015-01-01'::date
                 GROUP BY id
           ) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time
      ) foo ON t.id = foo.id
WHERE t.order_date > '2016-01-20';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Hash Right Join  (C=30990.39..54869.49 R=50070 W=14) (AT=842.020..1261.843 R=50503 L=1)
   Hash Cond: (gs.id = t.id)
   ->  Hash Join  (C=20929.52..44799.53 R=1 W=4) (AT=624.288..1016.711 R=124 L=1)
         Hash Cond: ((gs.id = gregtest_status.id) AND (gs.update_time = (min(gregtest_status.update_time))))
         ->  Seq Scan on gregtest_status gs  (C=0.00..16370.00 R=1000000 W=12) (AT=0.012..155.630 R=1000000 L=1)
         ->  Hash  (C=20928.74..20928.74 R=52 W=12) (AT=614.657..614.657 R=124 L=1)
               Buckets: 1024  Batches: 1  Memory Usage: 14kB
               ->  HashAggregate  (C=20927.70..20928.22 R=52 W=12) (AT=614.568..614.603 R=124 L=1)
                     Group Key: gregtest_status.id
                     ->  Seq Scan on gregtest_status  (C=0.00..18870.00 R=411540 W=12) (AT=0.025..417.488 R=412301 L=1)
                           Filter: (update_time >= '2015-01-01'::date)
                           Rows Removed by Filter: 587699
   ->  Hash  (C=9435.00..9435.00 R=50070 W=14) (AT=217.594..217.594 R=50503 L=1)
         Buckets: 65536  Batches: 1  Memory Usage: 2981kB
         ->  Seq Scan on gregtest t  (C=0.00..9435.00 R=50070 W=14) (AT=0.028..196.038 R=50503 L=1)
               Filter: (order_date > '2016-01-20 00:00:00-05'::timestamp with time zone)
               Rows Removed by Filter: 449497
 Planning time: 0.359 ms
 Execution time: 1265.709 ms

EXPLAIN ANALYZE SELECT t.* FROM gregtest t
LEFT JOIN (
           SELECT gs.id, gs.status
           FROM gregtest_status gs
           JOIN (
                 SELECT id, min(update_time) AS update_time
                 FROM gregtest_status
                 WHERE update_time >= '2015-01-01'::date
                 GROUP BY id
           ) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time
      ) foo ON t.id = foo.id
WHERE t.order_date > '2016-01-24';
                                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (C=20929.52..54961.99 R=48497 W=14) (AT=948.436..2871.727 R=48856 L=1)
   Join Filter: (t.id = gs.id)
   Rows Removed by Join Filter: 6009288
   ->  Seq Scan on gregtest t  (C=0.00..9435.00 R=48497 W=14) (AT=0.023..113.814 R=48856 L=1)
         Filter: (order_date > '2016-01-24 00:00:00-05'::timestamp with time zone)
         Rows Removed by Filter: 451144
   ->  Materialize  (C=20929.52..44799.54 R=1 W=4) (AT=0.013..0.033 R=124 L=48856)
         ->  Hash Join  (C=20929.52..44799.53 R=1 W=4) (AT=626.631..1009.945 R=124 L=1)
               Hash Cond: ((gs.id = gregtest_status.id) AND (gs.update_time = (min(gregtest_status.update_time))))
               ->  Seq Scan on gregtest_status gs  (C=0.00..16370.00 R=1000000 W=12) (AT=0.010..150.791 R=1000000 L=1)
               ->  Hash  (C=20928.74..20928.74 R=52 W=12) (AT=617.172..617.172 R=124 L=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 14kB
                     ->  HashAggregate  (C=20927.70..20928.22 R=52 W=12) (AT=617.085..617.113 R=124 L=1)
                           Group Key: gregtest_status.id
                            ->  Seq Scan on gregtest_status  (C=0.00..18870.00 R=411540 W=12) (AT=0.013..423.964 R=412301 L=1)
                                 Filter: (update_time >= '2015-01-01'::date)
                                 Rows Removed by Filter: 587699
 Planning time: 0.970 ms
 Execution time: 2875.785 ms


Sometimes the final date in the where clause must be adjusted to see the nested loop.

The actual production query is > 200x as slow.

Yes, the 2015 date is hardcoded in there. All that changes is the final date in the WHERE clause.

Adding indexes does not seem to help, so this example leaves them out.

Bumping default_statistics_target to 5000 merely moves the first appearance of the nested loop a few days *earlier*.

Changing cpu_tuple_cost moves it forward a bit, but not enough to cover the actual use case (now() - 1 month).

Leaving out the 'analyze' results in a much better plan, but having the tables stay unanalyzed does not seem
like a great solution. :) I'm hoping I am overlooking something obvious.


--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8


Re: 9.5 regression with unwanted nested loop left join

From
Tom Lane
Date:
Greg Sabino Mullane <greg@endpoint.com> writes:
> We are trying to upgrade a client app to Postgres 9.5, but are running
> into some performance regression issues (even though the current db is 8.x!).
> One in particular that is puzzling me involves a query that keeps slipping
> into a nested loop left join, rather than a much preferred hash join.

The core of the problem seems to be here:

>          ->  Hash Join  (C=20929.52..44799.53 R=1 W=4) (AT=626.631..1009.945 R=124 L=1)
>                Hash Cond: ((gs.id = gregtest_status.id) AND (gs.update_time = (min(gregtest_status.update_time))))

which is implementing the gs/gs2 join here:

>            SELECT gs.id, gs.status
>            FROM gregtest_status gs
>            JOIN (
>                  SELECT id, min(update_time) AS update_time
>                  FROM gregtest_status
>                  WHERE update_time >= '2015-01-01'::date
>                  GROUP BY id
>            ) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time

I tried this back to 8.4, and all versions predict just one row returned
from the gs/gs2 join, whereas in reality you get 124 rows, ie, all rows
produced by gs2 have matches in gs.
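
The misestimate is easy to see in isolation by explaining just the inner
join on its own (row counts below are from the test data in this thread):

```sql
EXPLAIN ANALYZE
SELECT gs.id, gs.status
FROM gregtest_status gs
JOIN (
      SELECT id, min(update_time) AS update_time
      FROM gregtest_status
      WHERE update_time >= '2015-01-01'::date
      GROUP BY id
) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time;
-- the planner estimates rows=1 for the join, while it actually
-- returns 124 rows (every gs2 row finds a match in gs)
```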

The plan 9.0 and up produce is not just a nestloop above this, but a
nestloop with an inner materialization step.  It would clearly not be
sensible to run this hashjoin many times, but materializing its output
gets rid of the cost-of-data problem --- and then, if you believe there's
only gonna be one row out, a simple nestloop looks cheaper than building
a hash table, at least up to a fairly large number of rows on the other
side.

8.4 avoids this trap only because it doesn't consider injecting a
materialize there.

So a brute-force fix to restore the pre-9.0 behavior would be
"set enable_material = off".  But really the problem is that it's
unobvious that all rows in the gs2 output would have matches in gs.
I wonder if there's a way to reformulate the query to make that better.
Do you even need the update_time check?
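
For instance, a DISTINCT ON formulation would express "earliest status row
per id" directly, with no self-join for the planner to misestimate. This is
only a sketch, untested against the real schema, and it differs slightly in
the case of ties:

```sql
-- Hypothetical rewrite: pick the earliest qualifying status row per id
-- directly, instead of self-joining against a per-id min(update_time).
-- Note: returns exactly one row per id; the original join could return
-- several if two rows share the same minimal update_time.
SELECT t.*
FROM gregtest t
LEFT JOIN (
      SELECT DISTINCT ON (id) id, status
      FROM gregtest_status
      WHERE update_time >= '2015-01-01'::date
      ORDER BY id, update_time
) foo ON t.id = foo.id
WHERE t.order_date > '2016-01-24';
```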

Assuming that the application's already been optimized for pre-9.0
Postgres, turning off enable_material might not be a disastrous long
term solution for it, though certainly it'd be better if you can move
away from that eventually.

            regards, tom lane


Re: 9.5 regression with unwanted nested loop left join

From
Greg Sabino Mullane
Date:
On Tue, May 24, 2016 at 11:19:04AM -0400, Tom Lane wrote:
> 8.4 avoids this trap only because it doesn't consider injecting a
> materialize there.
>
> So a brute-force fix to restore the pre-9.0 behavior would be
> "set enable_material = off".  But really the problem is that it's
> unobvious that all rows in the gs2 output would have matches in gs.
> I wonder if there's a way to reformulate the query to make that better.
> Do you even need the update_time check?

Thank you for the quick reply. Turning off enable_material does indeed
fix the problem on production, so it is good to know we have that
in our toolbox now. The update_time check is pretty buried in a view
inside a much more complicated query, and is needed for some other parts
of the query I left out to keep it simple.

> Assuming that the application's already been optimized for pre-9.0
> Postgres, turning off enable_material might not be a disastrous long
> term solution for it, though certainly it'd be better if you can move
> away from that eventually.

Yes, we are evaluating whether we should set this, or rewrite some queries
and see if this issue pops up anywhere else. Leaning towards the latter,
as I'm sure the planner *usually* does better with all options enabled. :)
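
For reference, if we do keep the setting we would scope it narrowly rather
than flipping it cluster-wide in postgresql.conf. A sketch of the options
we're weighing (the database name below is just a placeholder):

```sql
-- Option 1: disable materialization only around the affected query;
-- SET LOCAL reverts automatically at COMMIT or ROLLBACK.
BEGIN;
SET LOCAL enable_material = off;
-- ... run the problem query here ...
COMMIT;

-- Option 2: scope it to the application's database, so everything
-- else on the cluster keeps the default planner behavior.
ALTER DATABASE myapp SET enable_material = off;
```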

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
