On Tue, 21 Feb 2023 at 14:38, Charles <peacech@gmail.com> wrote:
> Disabling mergejoin on 14.6 and disabling nestedloop on 15.2 causes both to use hashjoin where it runs for 37ms in
14.6and 208ms in 15.2.
>
> 14.6:
> -> Parallel Index Scan using idx_stock_price_date on stock_price (cost=0.43..59671.39
rows=81248width=13) (actual time=0.021..0.689 rows=1427 loops=3)
> Index Cond: (date > $0)
> Filter: (value > 0)
> Rows Removed by Filter: 222
> 15.2:
> -> Parallel Seq Scan on stock_price (cost=0.00..64038.54 rows=91275 width=13)
(actualtime=130.043..173.124 rows=1427 loops=3)
> Filter: ((value > 0) AND (date > $0))
> Rows Removed by Filter: 906975
The difference in the two above fragments likely accounts for the
majority of the remaining performance difference. Possibly 15.2 is
using a Seq Scan because it's estimating slightly more rows from
stock_price for these two quals. For the date > $0 qual, the stats
don't really help as the planner does not know what $0 will be during
planning, so it'll just assume that the selectivity is 1/3rd of rows.
For the value > 0, there could be some variation there just between
ANALYZE runs. That might be enough to account for the difference in
estimate between 14 and 15.
You might also want to check that effective_cache_size is set to
something realistic on 15. random_page_cost is also a factor for index
scan vs seq scan. However, ultimately, the estimate is bad here and
adjusting those two GUCs to make this plan do what you want might be
to the detriment of other plans where the statistics are more
accurate. You could maybe get the planner to give you the v14 plan
if you changed the CTE qual to "where value + 0 > 0", that would mean
the planner would be unable to use the statistics, which presumably
say that most values are > 0, and it would apply the 0.3333 to both
those quals which should reduce the row estimate. However, that's not
to say it won't make things bad again for some other "date" value.
David