pgsql: Implement Self-Join Elimination - Mailing list pgsql-committers
From | Alexander Korotkov |
---|---|
Subject | pgsql: Implement Self-Join Elimination |
Date | |
Msg-id | E1tjyc0-007aUs-Hm@gemulon.postgresql.org Whole thread Raw |
List | pgsql-committers |
Implement Self-Join Elimination The Self-Join Elimination (SJE) feature removes an inner join of a plain table to itself in the query tree if it is proven that the join can be replaced with a scan without impacting the query result. Self-join and inner relation get replaced with the outer in query, equivalence classes, and planner info structures. Also, the inner restrictlist moves to the outer one with the removal of duplicated clauses. Thus, this optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses and, in turn, selectivity estimations, and potentially improves total planner prediction for the query. This feature is dedicated to avoiding redundancy, which can appear after pull-up transformations or the creation of an EquivalenceClass-derived clause like the below. SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3); SELECT * FROM t1 WHERE EXISTS (SELECT t3.x FROM t1 t3 WHERE t3.x = t1.x); SELECT * FROM t1,t2, t1 t3 WHERE t1.x = t2.x AND t2.x = t3.x; In the future, we could also reduce redundancy caused by subquery pull-up after unnecessary outer join removal in cases like the one below. SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3 LEFT JOIN t2 ON t2.x = t1.x); Also, it can drastically help to join partitioned tables, removing entries even before their expansion. The SJE proof is based on innerrel_is_unique() machinery. We can remove a self-join when for each outer row: 1. At most, one inner row matches the join clause; 2. Each matched inner row must be (physically) the same as the outer one; 3. Inner and outer rows have the same row mark. In this patch, we use the next approach to identify a self-join: 1. Collect all merge-joinable join quals which look like a.x = b.x; 2. Add to the list above the baseretrictinfo of the inner table; 3. Check innerrel_is_unique() for the qual list. If it returns false, skip this pair of joining tables; 4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the possibility of self-join elimination, the inner and outer clauses must match exactly. The relation replacement procedure is not trivial and is partly combined with the one used to remove useless left joins. Tests covering this feature were added to join.sql. Some of the existing regression tests changed due to self-join removal logic. Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru Author: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Author: Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Co-authored-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Simon Riggs <simon@2ndquadrant.com> Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org> Reviewed-by: David Rowley <david.rowley@2ndquadrant.com> Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Konstantin Knizhnik <k.knizhnik@postgrespro.ru> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Hywel Carver <hywel@skillerwhale.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Ronan Dunklau <ronan.dunklau@aiven.io> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Greg Stark <stark@mit.edu> Reviewed-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Michał Kłeczek <michal@kleczek.org> Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/fc069a3a6319b5bf40d2f0f1efceae1c9b7a68a8 Modified Files -------------- doc/src/sgml/config.sgml | 16 + src/backend/optimizer/path/equivclass.c | 3 +- src/backend/optimizer/path/indxpath.c | 39 + src/backend/optimizer/plan/analyzejoins.c | 1240 ++++++++++++++++++++++++++--- src/backend/optimizer/plan/planmain.c | 5 + src/backend/optimizer/prep/prepunion.c | 9 +- src/backend/rewrite/rewriteManip.c | 126 ++- src/backend/utils/misc/guc_tables.c | 10 + src/include/nodes/pathnodes.h | 40 +- src/include/optimizer/optimizer.h | 2 + src/include/optimizer/paths.h | 3 + src/include/optimizer/planmain.h | 6 + src/include/rewrite/rewriteManip.h | 4 + src/test/regress/expected/equivclass.out | 30 + src/test/regress/expected/join.out | 1083 +++++++++++++++++++++++++ src/test/regress/expected/sysviews.out | 3 +- src/test/regress/sql/equivclass.sql | 16 + src/test/regress/sql/join.sql | 494 ++++++++++++ src/tools/pgindent/typedefs.list | 2 + 19 files changed, 2983 insertions(+), 148 deletions(-)
pgsql-committers by date: