Thread: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate This commit implements the automatic conversion of 'x IN (VALUES ...)' into ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance of an unnecessary join. Since VALUES describes a relational table, and the value of such a list is a table row, the optimizer will likely face an underestimation problem due to the inability to estimate cardinality through MCV statistics. The cardinality evaluation mechanism can work with the array inclusion check operation. If the array is small enough (< 100 elements), it will perform a statistical evaluation element by element. We perform the transformation in the convert_ANY_sublink_to_join() if VALUES RTE is proper and the transformation is convertible. The conversion is only possible for operations on scalar values, not rows. Also, we currently support the transformation only when it ends up with a constant array. Otherwise, the evaluation of non-hashed SAOP might be slower than the corresponding Hash Join with VALUES. Discussion: https://postgr.es/m/0184212d-1248-4f1f-a42d-f5cb1c1976d2%40tantorlabs.com Author: Alena Rybakina <a.rybakina@postgrespro.ru> Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/c0962a113d1f2f94cb7222a7ca025a67e9ce3860 Modified Files -------------- src/backend/optimizer/plan/subselect.c | 80 ++++++++ src/backend/optimizer/prep/prepjointree.c | 12 ++ src/backend/optimizer/util/clauses.c | 14 +- src/include/optimizer/subselect.h | 3 + src/test/regress/expected/subselect.out | 308 ++++++++++++++++++++++++++++++ src/test/regress/sql/subselect.sql | 100 ++++++++++ 6 files changed, 512 insertions(+), 5 deletions(-)
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Melanie Plageman
Date:
On Fri, Apr 4, 2025 at 9:17 AM Alexander Korotkov <akorotkov@postgresql.org> wrote: > > Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate > > This commit implements the automatic conversion of 'x IN (VALUES ...)' into > ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance > of an unnecessary join. I haven't looked at this patch, but it seems likely that it is related to the recent failures I noticed in CI in the subselect test when the regress suite is run by 002_pg_upgrade.pl diff --strip-trailing-cr -U3 C:/cirrus/src/test/regress/expected/subselect.out C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out --- C:/cirrus/src/test/regress/expected/subselect.out 2025-04-04 14:44:17.637206600 +0000 +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out 2025-04-04 14:47:20.358393500 +0000 @@ -2769,15 +2769,16 @@ EXPLAIN (COSTS OFF) SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) WHERE c.oid IN (VALUES (1), (2)); - QUERY PLAN ---------------------------------------------------------------- - Hash Join - Hash Cond: (a.oid = c.oid) - -> Seq Scan on pg_am a - -> Hash - -> Index Scan using pg_class_oid_index on pg_class c - Index Cond: (oid = ANY ('{1,2}'::oid[])) -(6 rows) + QUERY PLAN +--------------------------------------------------------- + Merge Join + Merge Cond: (c.oid = a.oid) + -> Index Scan using pg_class_oid_index on pg_class c + Index Cond: (oid = ANY ('{1,2}'::oid[])) + -> Sort + Sort Key: a.oid + -> Seq Scan on pg_am a +(7 rows)
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Melanie Plageman
Date:
On Fri, Apr 4, 2025 at 9:17 AM Alexander Korotkov <akorotkov@postgresql.org> wrote: > > Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate > > This commit implements the automatic conversion of 'x IN (VALUES ...)' into > ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance > of an unnecessary join. I haven't looked at this patch, but it seems likely that it is related to the recent failures I noticed in CI in the subselect test when the regress suite is run by 002_pg_upgrade.pl diff --strip-trailing-cr -U3 C:/cirrus/src/test/regress/expected/subselect.out C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out --- C:/cirrus/src/test/regress/expected/subselect.out 2025-04-04 14:44:17.637206600 +0000 +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out 2025-04-04 14:47:20.358393500 +0000 @@ -2769,15 +2769,16 @@ EXPLAIN (COSTS OFF) SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) WHERE c.oid IN (VALUES (1), (2)); - QUERY PLAN ---------------------------------------------------------------- - Hash Join - Hash Cond: (a.oid = c.oid) - -> Seq Scan on pg_am a - -> Hash - -> Index Scan using pg_class_oid_index on pg_class c - Index Cond: (oid = ANY ('{1,2}'::oid[])) -(6 rows) + QUERY PLAN +--------------------------------------------------------- + Merge Join + Merge Cond: (c.oid = a.oid) + -> Index Scan using pg_class_oid_index on pg_class c + Index Cond: (oid = ANY ('{1,2}'::oid[])) + -> Sort + Sort Key: a.oid + -> Seq Scan on pg_am a +(7 rows)
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Alexander Korotkov
Date:
On Fri, Apr 4, 2025 at 6:47 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Apr 4, 2025 at 9:17 AM Alexander Korotkov > <akorotkov@postgresql.org> wrote: > > > > Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate > > > > This commit implements the automatic conversion of 'x IN (VALUES ...)' into > > ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance > > of an unnecessary join. > > I haven't looked at this patch, but it seems likely that it is related > to the recent failures I noticed in CI in the subselect test when the > regress suite is run by 002_pg_upgrade.pl Thank you for pointing. I'm looking right now. ------ Regards, Alexander Korotkov Supabase
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Alexander Korotkov
Date:
On Fri, Apr 4, 2025 at 6:47 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Apr 4, 2025 at 9:17 AM Alexander Korotkov > <akorotkov@postgresql.org> wrote: > > > > Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate > > > > This commit implements the automatic conversion of 'x IN (VALUES ...)' into > > ScalarArrayOpExpr. That simplifies the query tree, eliminating the appearance > > of an unnecessary join. > > I haven't looked at this patch, but it seems likely that it is related > to the recent failures I noticed in CI in the subselect test when the > regress suite is run by 002_pg_upgrade.pl Thank you for pointing. I'm looking right now. ------ Regards, Alexander Korotkov Supabase
On Mon, 7 Apr 2025 at 19:39, Melanie Plageman <melanieplageman@gmail.com> wrote: > +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > 2025-04-04 14:47:20.358393500 +0000 > @@ -2769,15 +2769,16 @@ > EXPLAIN (COSTS OFF) > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > WHERE c.oid IN (VALUES (1), (2)); > - QUERY PLAN > ---------------------------------------------------------------- > - Hash Join > - Hash Cond: (a.oid = c.oid) > - -> Seq Scan on pg_am a > - -> Hash > - -> Index Scan using pg_class_oid_index on pg_class c > - Index Cond: (oid = ANY ('{1,2}'::oid[])) > -(6 rows) > + QUERY PLAN > +--------------------------------------------------------- > + Merge Join > + Merge Cond: (c.oid = a.oid) > + -> Index Scan using pg_class_oid_index on pg_class c > + Index Cond: (oid = ANY ('{1,2}'::oid[])) > + -> Sort > + Sort Key: a.oid > + -> Seq Scan on pg_am a > +(7 rows) Are these failures from patches applied to master prior to 3ba2cdaa? David
On Mon, 7 Apr 2025 at 19:39, Melanie Plageman <melanieplageman@gmail.com> wrote: > +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > 2025-04-04 14:47:20.358393500 +0000 > @@ -2769,15 +2769,16 @@ > EXPLAIN (COSTS OFF) > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > WHERE c.oid IN (VALUES (1), (2)); > - QUERY PLAN > ---------------------------------------------------------------- > - Hash Join > - Hash Cond: (a.oid = c.oid) > - -> Seq Scan on pg_am a > - -> Hash > - -> Index Scan using pg_class_oid_index on pg_class c > - Index Cond: (oid = ANY ('{1,2}'::oid[])) > -(6 rows) > + QUERY PLAN > +--------------------------------------------------------- > + Merge Join > + Merge Cond: (c.oid = a.oid) > + -> Index Scan using pg_class_oid_index on pg_class c > + Index Cond: (oid = ANY ('{1,2}'::oid[])) > + -> Sort > + Sort Key: a.oid > + -> Seq Scan on pg_am a > +(7 rows) Are these failures from patches applied to master prior to 3ba2cdaa? David
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Alexander Korotkov
Date:
Hi, David! On Mon, Apr 7, 2025 at 12:15 PM David Rowley <dgrowleyml@gmail.com> wrote: > > On Mon, 7 Apr 2025 at 19:39, Melanie Plageman <melanieplageman@gmail.com> wrote: > > +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > > 2025-04-04 14:47:20.358393500 +0000 > > @@ -2769,15 +2769,16 @@ > > EXPLAIN (COSTS OFF) > > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > > WHERE c.oid IN (VALUES (1), (2)); > > - QUERY PLAN > > ---------------------------------------------------------------- > > - Hash Join > > - Hash Cond: (a.oid = c.oid) > > - -> Seq Scan on pg_am a > > - -> Hash > > - -> Index Scan using pg_class_oid_index on pg_class c > > - Index Cond: (oid = ANY ('{1,2}'::oid[])) > > -(6 rows) > > + QUERY PLAN > > +--------------------------------------------------------- > > + Merge Join > > + Merge Cond: (c.oid = a.oid) > > + -> Index Scan using pg_class_oid_index on pg_class c > > + Index Cond: (oid = ANY ('{1,2}'::oid[])) > > + -> Sort > > + Sort Key: a.oid > > + -> Seq Scan on pg_am a > > +(7 rows) > > Are these failures from patches applied to master prior to 3ba2cdaa? Yes, these failures appears before 3ba2cdaa. Tom committed 3ba2cdaa to fix the problem before I get into it [1]. Links. 1. https://www.postgresql.org/message-id/srnuqlttuimzmvoulhsrbgvj4vnul6b65osswvua7sfkqsvmuy%40yg7apybpxp34 ------ Regards, Alexander Korotkov Supabase
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Alexander Korotkov
Date:
Hi, David! On Mon, Apr 7, 2025 at 12:15 PM David Rowley <dgrowleyml@gmail.com> wrote: > > On Mon, 7 Apr 2025 at 19:39, Melanie Plageman <melanieplageman@gmail.com> wrote: > > +++ C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/results/subselect.out > > 2025-04-04 14:47:20.358393500 +0000 > > @@ -2769,15 +2769,16 @@ > > EXPLAIN (COSTS OFF) > > SELECT c.oid,c.relname FROM pg_class c JOIN pg_am a USING (oid) > > WHERE c.oid IN (VALUES (1), (2)); > > - QUERY PLAN > > ---------------------------------------------------------------- > > - Hash Join > > - Hash Cond: (a.oid = c.oid) > > - -> Seq Scan on pg_am a > > - -> Hash > > - -> Index Scan using pg_class_oid_index on pg_class c > > - Index Cond: (oid = ANY ('{1,2}'::oid[])) > > -(6 rows) > > + QUERY PLAN > > +--------------------------------------------------------- > > + Merge Join > > + Merge Cond: (c.oid = a.oid) > > + -> Index Scan using pg_class_oid_index on pg_class c > > + Index Cond: (oid = ANY ('{1,2}'::oid[])) > > + -> Sort > > + Sort Key: a.oid > > + -> Seq Scan on pg_am a > > +(7 rows) > > Are these failures from patches applied to master prior to 3ba2cdaa? Yes, these failures appears before 3ba2cdaa. Tom committed 3ba2cdaa to fix the problem before I get into it [1]. Links. 1. https://www.postgresql.org/message-id/srnuqlttuimzmvoulhsrbgvj4vnul6b65osswvua7sfkqsvmuy%40yg7apybpxp34 ------ Regards, Alexander Korotkov Supabase
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Melanie Plageman
Date:
On Mon, Apr 7, 2025 at 5:15 AM David Rowley <dgrowleyml@gmail.com> wrote:
Are these failures from patches applied to master prior to 3ba2cdaa?
Yea, my email was held in moderation for days. I guess cross-posting is flagged. I thought I saw people regularly cc pgsql-hackers when replying to pgsql-committers, but I guess not.
Re: pgsql: Convert 'x IN (VALUES ...)' to 'x = ANY ...' then appropriate
From
Melanie Plageman
Date:
On Mon, Apr 7, 2025 at 5:15 AM David Rowley <dgrowleyml@gmail.com> wrote:
Are these failures from patches applied to master prior to 3ba2cdaa?
Yea, my email was held in moderation for days. I guess cross-posting is flagged. I thought I saw people regularly cc pgsql-hackers when replying to pgsql-committers, but I guess not.