Re: Replace IN VALUES with ANY in WHERE clauses during optimization - Mailing list pgsql-hackers
From | Alena Rybakina |
---|---|
Subject | Re: Replace IN VALUES with ANY in WHERE clauses during optimization |
Date | |
Msg-id | 843f8d10-0a84-478b-b14e-9e8e29024634@postgrespro.ru Whole thread Raw |
Responses |
Re: Replace IN VALUES with ANY in WHERE clauses during optimization
Re: Replace IN VALUES with ANY in WHERE clauses during optimization |
List | pgsql-hackers |
Hi! On 03.10.2024 22:52, Ivan Kush wrote: > > Hello, hackers! I with my friends propose the patch to replace IN > VALUES to ANY in WHERE clauses. > > # Intro > > The `VALUES` in the `IN VALUES` construct is replaced with with an > array of values when `VALUES` contains 1 column. In the end it will be > replaced with ANY by the existing function makeA_Expr > (src/backend/nodes/makefuncs.c) > > This improves performance, especially if the values are small. > > # Patch > > v1-in_values_to_array.patch > > # How realized > > `VALUES` statement corresponds to `values_clause` nonterminal symbol > in gram.y, where it's parsed to `SelectStmt` node. > > `IN` is parsed in `a_expr` symbol. When it contains `VALUES` with 1 > column, parser extracts data from `SelectStmt` and passes it > > to function call `makeSimpleA_Expr` where simple `A_Expr` is created. > > Later during optimizations of parser tree this `A_Expr` will be > transformed to `ArrayExpr` (already realized in Postgres) > > > # Authors. > Author: Ivan Kush <ivan.kush@tantorlabs.com> > Author: Vadim Yacenko <vadim.yacenko@tantorlabs.com> > Author: Alexander Simonov <alexander.simonov@tantorlabs.com> > > # Tests > Implementation contains many regression tests of varying complexity, > which check supported features. > > # Platform > This patch was checkouted from tag REL_17_STABLE. Code is developed in > Linux, doesn't contain platfrom-specific code, only Postgres internal > data structures and functions. > > # Documentation > Regression tests contain many examples > > # Performance > It increases performance > > # Example > Let's compare result. With path the execution time is significantly > lower. > > We have a table table1 with 10000 rows. > > postgres=# \d table1; > Table "public.table1" > Column | Type | Collation | Nullable | Default > --------+-----------------------------+-----------+----------+--------- > fld1 | timestamp without time zone | | not null | > fld2 | bytea | | not null | > Indexes: > "table1index" btree (fld2) > > Let's execute several commands > see commands.sql > > Plan no patch > see plan_no_patch.txt > > > Plan with patch > see plan_with_patch.txt I think you should think about putting these constants in ANY Array EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) select * from t where x in (VALUES(1200), (1)); QUERY PLAN --------------------------------------------------- Seq Scan on t (actual rows=1 loops=1) Filter: (x = ANY ('{1200,1}'::integer[])) (3 rows) Anlrey Lepikhov and I recently described this in an article [0] here and the implementation already exists, but for now it was posted a binary application for testing. The acceleration is significant I agree. [0] https://danolivo.substack.com/p/7456653e-9716-4e91-ad09-83737784c665 -- Regards, Alena Rybakina Postgres Professional
pgsql-hackers by date: