
Allow parallel DISTINCT - Mailing list pgsql-hackers

From	David Rowley
Subject	Allow parallel DISTINCT
Date	August 11, 2021 04:51:02
Msg-id	CAApHDvrjRxVKwQN0he79xS+9wyotFXL=RmoWqGGO2N45Farpgw@mail.gmail.com
Responses	Re: Allow parallel DISTINCT
List	pgsql-hackers


Back in March 2016, e06a38965 added support for parallel aggregation.
IIRC, because it was fairly late in the release cycle, I dropped
parallel DISTINCT to reduce the scope a little. It's been on my list
of things to fix since then. I just didn't get around to it until
today.

The patch is just some plumbing work to connect all the correct paths
up to make it work. It's all fairly trivial.
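
For readers unfamiliar with the idea, here is a rough illustration of what
a two-phase parallel DISTINCT looks like conceptually. This is not code
from the patch and says nothing about the planner internals; the function
names and the use of a thread pool as a stand-in for parallel workers are
assumptions made purely for the sketch. Each worker removes duplicates
from its own share of the rows, and a final pass over the gathered results
removes duplicates that showed up in more than one worker.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_distinct(chunk):
    """Per-worker step: drop duplicates within this worker's share of rows."""
    return set(chunk)


def parallel_distinct(rows, workers=4):
    """Hypothetical two-phase DISTINCT: partial per worker, then a final pass."""
    # Split the input into one slice per worker (stand-in for a parallel scan).
    chunks = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_distinct, chunks))
    # Gather step: the same value can still appear in several workers'
    # outputs, so the leader must do one more distinct pass over the union.
    result = set()
    for partial in partials:
        result |= partial
    return result
```

The final pass is the important part: without it, a value seen by two
different workers would be returned twice.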

I thought about refactoring things a bit more to get rid of the
additional calls to grouping_is_sortable() and grouping_is_hashable(),
but I just don't think it's worth making the code ugly for.  We'll
only call them again if we're considering a parallel plan, in which
case it's most likely not a trivial query.  Those functions are pretty
cheap anyway.

I understand that there's another patch in the September commitfest
that does some stuff with Parallel DISTINCT, but that goes about
things a completely different way by creating multiple queues to
distribute values by hash.  I don't think there's any overlap here.
We'd likely want to still have the planner consider both methods if we
get that patch sometime.
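
To make the contrast concrete, here is a minimal sketch of the
hash-distribution idea. The other patch is only described above as
"creating multiple queues to distribute values by hash", so everything
below is an assumption about what that could mean in the abstract, not a
description of that patch. The function name and queue layout are
hypothetical.

```python
def hash_partitioned_distinct(rows, workers=4):
    """Hypothetical sketch: route each value to a queue chosen by its hash."""
    queues = [[] for _ in range(workers)]
    for value in rows:
        # Equal values always hash to the same queue, so no value can end
        # up being handled by two different workers.
        queues[hash(value) % workers].append(value)
    # Each worker deduplicates only its own queue.
    per_worker = [set(queue) for queue in queues]
    # The per-worker results are disjoint, so they can simply be
    # concatenated; no final distinct pass is needed.
    output = []
    for distinct_values in per_worker:
        output.extend(distinct_values)
    return output
```

The trade-off in this sketch is that the rows must be redistributed
before deduplication, whereas the two-phase approach deduplicates in
place and pays for a final pass instead.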

David

Attachment

parallel_distinct.patch
