Thread: Apply the "LIMIT 1" optimization to partial DISTINCT
In 5543677ec9 we introduced an optimization that uses Limit instead of
Unique to implement DISTINCT when all the DISTINCT pathkeys have been
marked as redundant. I happened to notice that this optimization was
not applied to partial DISTINCT, which I think should be. This can
improve plans in some cases, such as
-- on master
explain (costs off) select distinct four from tenk1 where four = 4;
QUERY PLAN
----------------------------------------------
Limit
-> Gather
Workers Planned: 4
-> Unique
-> Parallel Seq Scan on tenk1
Filter: (four = 4)
(6 rows)
-- patched
explain (costs off) select distinct four from tenk1 where four = 4;
QUERY PLAN
----------------------------------------------
Limit
-> Gather
Workers Planned: 4
-> Limit
-> Parallel Seq Scan on tenk1
Filter: (four = 4)
(6 rows)
Such queries might not be that common, but it's very cheap to apply this
optimization.
Attached is a patch for that.
Thanks
Richard
Unique to implement DISTINCT when all the DISTINCT pathkeys have been
marked as redundant. I happened to notice that this optimization was
not applied to partial DISTINCT, which I think should be. This can
improve plans in some cases, such as
-- on master
explain (costs off) select distinct four from tenk1 where four = 4;
QUERY PLAN
----------------------------------------------
Limit
-> Gather
Workers Planned: 4
-> Unique
-> Parallel Seq Scan on tenk1
Filter: (four = 4)
(6 rows)
-- patched
explain (costs off) select distinct four from tenk1 where four = 4;
QUERY PLAN
----------------------------------------------
Limit
-> Gather
Workers Planned: 4
-> Limit
-> Parallel Seq Scan on tenk1
Filter: (four = 4)
(6 rows)
Such queries might not be that common, but it's very cheap to apply this
optimization.
Attached is a patch for that.
Thanks
Richard
Attachment
On Fri, 26 Jan 2024 at 20:42, Richard Guo <guofenglinux@gmail.com> wrote: > > In 5543677ec9 we introduced an optimization that uses Limit instead of > Unique to implement DISTINCT when all the DISTINCT pathkeys have been > marked as redundant. I happened to notice that this optimization was > not applied to partial DISTINCT, which I think should be. It seems very likely that the parallel plan would only be chosen if the planner estimated there'd just be 1 row before the distinct. Otherwise, the non-partial path's LIMIT would come out so cheap that it would be unlikely that the parallel plan would be picked. I think your test case only chooses the parallel plan because you're doing FROM tenk1 WHERE four=4. And that column only contains values 0..3. However, having said that. Parallel plans are often picked when there is some highly selective qual as parallel_tuple_cost has to be applied to fewer tuples for such plans, so probably this is worth doing. David
On Fri, 26 Jan 2024 at 21:14, David Rowley <dgrowleyml@gmail.com> wrote: > However, having said that. Parallel plans are often picked when there > is some highly selective qual as parallel_tuple_cost has to be applied > to fewer tuples for such plans, so probably this is worth doing. I was messing around with your test case and didn't manage to get any plan that had any rows to use the partial path with the LIMIT. I ended up dropping the test that was checking the results were empty as I didn't think it added much more value over the EXPLAIN output. I pushed the result. Thanks for working on this. David
On Wed, Jan 31, 2024 at 12:26 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 26 Jan 2024 at 21:14, David Rowley <dgrowleyml@gmail.com> wrote:
> However, having said that. Parallel plans are often picked when there
> is some highly selective qual as parallel_tuple_cost has to be applied
> to fewer tuples for such plans, so probably this is worth doing.
I was messing around with your test case and didn't manage to get any
plan that had any rows to use the partial path with the LIMIT. I
ended up dropping the test that was checking the results were empty as
I didn't think it added much more value over the EXPLAIN output.
I pushed the result.
Thanks for pushing it!
Thanks
Richard
Thanks
Richard