On Tue, 26 Nov 2024 at 09:55, Ba Jinsheng <bajinsheng@u.nus.edu> wrote:
> TPC-DS query 95:
> Its execution time is nearly 1 min:
> CTE ws_wh
> -> Hash Join (cost=37772.14..198810.77 rows=7242361 width=12) (actual time=211.161..1443.926 rows=6644004
loops=1)
> If applying this patch:
> - if (is_innerrel_unique_for(root, joinrelids, outerrelids, innerrel,
> + if (!is_innerrel_unique_for(root, joinrelids, outerrelids, innerrel,
> The execution time is reduced to 6 seconds:
> CTE ws_wh
> -> Hash Join (cost=37772.14..74062.53 rows=7095248 width=12) (actual time=203.407..560.264 rows=719205
loops=1)
> The difference between both query plans is the second one uses Materialize instead of Memoize. From the code, it
seemsthat changing the usage of the cache brings performance improvement unexpectedly.
What's going on here is that you've introduced a new bug which has the
Hash Join effectively perform a semi-join rather than an inner join so
that it only joins the first matching inner-side row instead of all
matching inner-side rows for each outer-side row. The bug is fairly
evident when to see that the Hash Join produces 6644004 rows without
your change and only 719205 rows after you added the bug.
I'm not sure what you expect us to do here, but just in case you're
not aware, we're not going to prioritise performance over correct
results.
I expect the problem you've introduced will be more evident if you use
EXPLAIN ANALYZE VERBOSE. The verbose option will show which joins are
marked "Inner Unique". If you want to learn more about the
optimisation you've broken, see [1].
David
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9c7f5229ad68d7e0e4dd149e3f80257893e404d4