Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant
Date
Msg-id CAMbWs4-Lxt1=ENuNW6V3hvwZUR=JbUDR+f2omfeASRiRuRattQ@mail.gmail.com
Whole thread Raw
In response to Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers

On Wed, Oct 26, 2022 at 4:25 PM David Rowley <dgrowleyml@gmail.com> wrote:
One other thought I had about the duplicate "Limit" node in the final
plan was that we could make the limit clause an Expr like
LEAST(<existing limit clause>, 1).  That way we could ensure we get at
most 1 row, but perhaps less if the expression given in the LIMIT
clause evaluated to 0. This will still work correctly when the
existing limit evaluates to NULL. I'm still just not that keen on this
idea as it means still having to either edit the parse's limitCount or
store the limit details in a new field in PlannerInfo and use that
when making the final LimitPath. However, I'm still not sure doing
this is worth the extra complexity.
 
I find the duplicate "Limit" node is not that concerning after I realize
it may appear in other queries, such as

explain (analyze, timing off, costs off)
select * from (select * from (select * from generate_series(1,100)i limit 10) limit 5) limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit (actual rows=1 loops=1)
   ->  Limit (actual rows=1 loops=1)
         ->  Limit (actual rows=1 loops=1)
               ->  Function Scan on generate_series i (actual rows=1 loops=1)

Although the situation is different in that the Limit node is actually
atop SubqueryScan which is removed afterwards, but the final plan
appears as a Limit node atop another Limit node.

So I wonder maybe we can just live with it, or resolve it in a separate
patch.

Thanks
Richard

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: GUC values - recommended way to declare the C variables?
Next
From: Michael Paquier
Date:
Subject: Re: GUC values - recommended way to declare the C variables?