Thread: pgsql: Allow parallel DISTINCT
Allow parallel DISTINCT

We've supported parallel aggregation since e06a38965.  At the time, we
didn't quite get around to also adding parallel DISTINCT.  So, let's do
that now.

This is implemented by introducing a two-phase DISTINCT.  Phase 1 is
performed on parallel workers, rows are made distinct there either by
hashing or by sort/unique.  The results from the parallel workers are
combined and the final distinct phase is performed serially to get rid of
any duplicate rows that appear due to combining rows for each of the
parallel workers.

Author: David Rowley
Reviewed-by: Zhihong Yu
Discussion: https://postgr.es/m/CAApHDvrjRxVKwQN0he79xS+9wyotFXL=RmoWqGGO2N45Farpgw@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/22c4e88ebff408acd52e212543a77158bde59e69

Modified Files
--------------
src/backend/optimizer/README                  |   1 +
src/backend/optimizer/plan/planner.c          | 219 ++++++++++++++++++++++----
src/include/nodes/pathnodes.h                 |   1 +
src/test/regress/expected/select_distinct.out |  67 ++++++++
src/test/regress/sql/select_distinct.sql      |  37 +++++
5 files changed, 292 insertions(+), 33 deletions(-)
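
For readers who want a feel for the two-phase scheme the commit message
describes, here is a minimal sketch in Python.  It is not the planner code
from planner.c and does not reflect how PostgreSQL builds or costs paths.
The function names (distinct_by_hash, distinct_by_sort, parallel_distinct)
are hypothetical, rows are modelled as plain tuples, and the "workers" are
simulated by looping over partitions sequentially.  It only illustrates why
a final serial distinct step is still needed after each worker has removed
its own duplicates.

```python
from itertools import groupby


def distinct_by_hash(rows):
    """Phase-1 option: drop duplicates with a hash set, keeping first-seen order."""
    seen = set()
    out = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


def distinct_by_sort(rows):
    """Phase-1 option: sort the rows, then keep one row per run of equal rows."""
    return [key for key, _ in groupby(sorted(rows))]


def parallel_distinct(partitions, phase1=distinct_by_hash):
    """Two-phase DISTINCT over a list of per-worker row partitions.

    Assumption: each partition stands in for the rows one parallel worker
    would scan; the workers are simulated one after another here.
    """
    # Phase 1: every worker makes its own partition distinct.
    partials = [phase1(part) for part in partitions]

    # Combine the per-worker results into a single stream of rows.
    combined = [row for partial in partials for row in partial]

    # Phase 2: a serial distinct removes duplicates that only show up
    # once rows from different workers are put together.
    return distinct_by_sort(combined)


# Each worker sees some duplicates locally, and some rows overlap across workers.
partitions = [
    [(1, "a"), (2, "b"), (1, "a")],
    [(2, "b"), (3, "c")],
    [(3, "c"), (3, "c"), (4, "d")],
]
result = parallel_distinct(partitions)
```

In this sketch (2, "b") and (3, "c") each survive phase 1 in two different
partitions, so the combined stream still contains duplicates until the final
serial step runs.  Passing phase1=distinct_by_sort instead gives the same
final result, mirroring the "hashing or sort/unique" choice mentioned above.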