Home > mailing lists

Re: A new strategy for pull-up correlated ANY_SUBLINK - Mailing list pgsql-hackers

From	vignesh C
Subject	Re: A new strategy for pull-up correlated ANY_SUBLINK
Date	January 26, 2024 12:46:02
Msg-id	CALDaNm2ck2ykEpF+7LzT6c1GeXLn-YgDBwUh4VmQiZmeHEUetQ@mail.gmail.com Whole thread Raw
In response to	Re: A new strategy for pull-up correlated ANY_SUBLINK (Alena Rybakina <lena.ribackina@yandex.ru>)
Responses	Re: A new strategy for pull-up correlated ANY_SUBLINK
List	pgsql-hackers

Tree view

On Fri, 13 Oct 2023 at 14:09, Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>
> On 13.10.2023 10:04, Andy Fan wrote:
>>
>> It seems to me that the expressions "=" and "IN" are equivalent here due to the fact that the aggregated subquery
returnsonly one value, and the result with the "IN" operation can be considered as the intersection of elements on the
leftand right. In this query, we have some kind of set on the left, among which there will be found or not only one
elementon the right. 
>
>
> Yes, they are equivalent at the final result, but there are some
> differences at the execution level.  the '=' case will be transformed
> to a Subplan whose subPlanType is EXPR_SUBLINK, so if there
> is more than 1 rows is returned in the subplan, error will be raised.
>
> select * from tenk1 where
>   ten =  (select ten from tenk1 i where i.two = tenk1.two );
>
> ERROR:  more than one row returned by a subquery used as an expression
>
> However the IN case would not.
> select * from tenk1 where
>   ten =  (select ten from tenk1 i where i.two = tenk1.two ) is OK.
>
>
> I think the test case you added is not related to this feature. the
> difference is there even without the patch.  so I kept the code
> you changed, but not for the test  case.
>
> Yes, I understand and agree with you that we should delete the last queries, except to one.
>
> The query below have a different result compared to master, and it is correct.
>
>
> Without your patch:
>
> explain (costs off)
> +SELECT * FROM tenk1 A LEFT JOIN tenk2 B
> ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
>                                  QUERY PLAN
> -----------------------------------------------------------------------------
>  Nested Loop Left Join
>    ->  Seq Scan on tenk1 a
>    ->  Materialize
>          ->  Seq Scan on tenk2 b
>                Filter: (SubPlan 2)
>                SubPlan 2
>                  ->  Result
>                        InitPlan 1 (returns $1)
>                          ->  Limit
>                                ->  Index Scan using tenk2_hundred on tenk2 c
>                                      Index Cond: (hundred IS NOT NULL)
>                                      Filter: (odd = b.odd)
> (12 rows)
>
>
> After your patch:
>
> postgres=# explain (costs off)
> SELECT * FROM tenk1 A LEFT JOIN tenk2 B
> ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
>
>                            QUERY PLAN
> --------------------------------------------------------------
>   Nested Loop Left Join
>     ->  Seq Scan on tenk1 a
>     ->  Materialize
>           ->  Nested Loop
>                 ->  Seq Scan on tenk2 b
>                 ->  Subquery Scan on "ANY_subquery"
>                       Filter: (b.hundred = "ANY_subquery".min)
>                       ->  Aggregate
>                             ->  Seq Scan on tenk2 c
>                                   Filter: (odd = b.odd)
> (10 rows)
>
>
>>> I took the liberty of adding this to your patch and added myself as reviewer, if you don't mind.
>>
>> Sure, the patch after your modification looks better than the original.
>> I'm not sure how the test case around "because of got one row" is
>> relevant to the current changes.  After we reach to some agreement
>> on the above discussion, I think v4 is good for committer to review!
>>
>>
>> Thank you!) I am ready to discuss it.
>
>
> Actually I meant to discuss the "Unfortunately, I found a request..", looks
> we have reached an agreement there:)
>
> Yes, we have)

Hi Andy Fan,

If the changes of Alena are ok, can you merge the changes and post an
updated version so that CFBot can apply the patch and verify the
changes. As currently CFBot is trying to apply only Alena's changes
and failing with the following at [1]:
=== Applying patches on top of PostgreSQL commit ID
fba2112b1569fd001a9e54dfdd73fd3cb8f16140 ===
=== applying patch ./pull-up.diff
patching file src/test/regress/expected/subselect.out
Hunk #1 succeeded at 1926 with fuzz 2 (offset -102 lines).
patching file src/test/regress/sql/subselect.sql
Hunk #1 FAILED at 1000.
1 out of 1 hunk FAILED -- saving rejects to file
src/test/regress/sql/subselect.sql.rej

[1] - http://cfbot.cputube.org/patch_46_4268.log

Regards,
Vignesh

pgsql-hackers by date:

From: Nikita Malakhov
Date: 26 January 2024, 12:42:50
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing

From: vignesh C
Date: 26 January 2024, 12:50:07
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

Re: A new strategy for pull-up correlated ANY_SUBLINK - Mailing list pgsql-hackers

Previous

Next