Re: 9.5 regression with unwanted nested loop left join - Mailing list pgsql-general

From Tom Lane
Subject Re: 9.5 regression with unwanted nested loop left join
Date
Msg-id 31696.1464103144@sss.pgh.pa.us
Whole thread Raw
In response to 9.5 regression with unwanted nested loop left join  (Greg Sabino Mullane <greg@endpoint.com>)
Responses Re: 9.5 regression with unwanted nested loop left join
List pgsql-general
Greg Sabino Mullane <greg@endpoint.com> writes:
> We are trying to upgrade a client app to Postgres 9.5, but are running
> into some performance regression issues (even though the curent db is 8.x!).
> One in particular that is puzzling me involves a query that keeps slipping
> into a nested loop left join, rather than a much preferred hash join.

The core of the problem seems to be here:

>          ->  Hash Join  (C=20929.52..44799.53 R=1 W=4) (AT=626.631..1009.945 R=124 L=1)
>                Hash Cond: ((gs.id = gregtest_status.id) AND (gs.update_time = (min(gregtest_status.update_time))))

which is implementing the gs/gs2 join here:

>            SELECT gs.id, gs.status
>            FROM gregtest_status gs
>            JOIN (
>                  SELECT id, min(update_time) AS update_time
>                  FROM gregtest_status
>                  WHERE update_time >= '2015-01-01'::date
>                  GROUP BY id
>            ) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time

I tried this back to 8.4, and all versions predict just one row returned
from the gs/gs2 join, whereas in reality you get 124 rows, ie, all rows
produced by gs2 have matches in gs.

The plan 9.0 and up produce is not just a nestloop above this, but a
nestloop with inner materialization step.  It would clearly not be
sensible to run this hashjoin many times, but materializing its output
gets rid of the cost-of-data problem --- and then, if you believe there's
only gonna be one row out, a simple nestloop looks cheaper than building
a hash table, at least up to a fairly large number of rows on the other
side.

8.4 avoids this trap only because it doesn't consider injecting a
materialize there.

So a brute-force fix to restore the pre-9.0 behavior would be
"set enable_material = off".  But really the problem is that it's
unobvious that all rows in the gs2 output would have matches in gs.
I wonder if there's a way to reformulate the query to make that better.
Do you even need the update_time check?

Assuming that the application's already been optimized for pre-9.0
Postgres, turning off enable_material might not be a disastrous long
term solution for it, though certainly it'd be better if you can move
away from that eventually.

            regards, tom lane


pgsql-general by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: 9.5 regression with unwanted nested loop left join
Next
From: parihaaraka
Date:
Subject: Re: pg_upgrade error regarding hstore operator