Re: Parallel Sort - Mailing list pgsql-hackers

From Hitoshi Harada
Subject Re: Parallel Sort
Date
Msg-id CAP7QgmnZHZvvbb4hDTJ5-95Zn73TAy6RLXf6VdsiM3TAiZMhcw@mail.gmail.com
In response to Re: Parallel Sort  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers



On Wed, May 15, 2013 at 11:11 AM, Noah Misch <noah@leadboat.com> wrote:
On Wed, May 15, 2013 at 08:12:34AM +0900, Michael Paquier wrote:
> The concept of clause parallelism for backend worker is close to the
> concept of clause shippability introduced in Postgres-XC. In the case of
> XC, the equivalent of the master backend is a backend located on a node
> called Coordinator that merges and organizes results fetched in parallel
> from remote nodes where data scans occur (on nodes called Datanodes). The
> backends used for tuple scans across Datanodes share the same data
> visibility as they use the same snapshot and transaction ID as the backend
> on Coordinator. This is different from the parallelism as there is no idea
> of snapshot import to worker backends.

Worker backends would indeed share snapshot and XID.

> However, the code in XC planner used for clause shippability evaluation is
> definitely worth looking at just considering the many similarities it
> shares with parallelism when evaluating if a given clause can be executed
> on a worker backend or not. It would be a waste to implement the same
> thing twice if there is code already available.

Agreed.  Local parallel query is very similar to distributed query; the
specific IPC cost multipliers differ, but that's about it.  I hope we can
benefit from XC's experience in this area.


I believe parallel execution is much easier if the data is partitioned.  Of course it is possible to parallelize only the sort operation, but then the question becomes how to split the input and pass each tuple to the workers.  XC and Greenplum use the notion of a hash-distributed table, which enables parallel sort (XC doesn't perform parallel sort on replicated tables, I guess).  For postgres, I don't think a hash-distributed table is a foreseeable option, but MergeAppend over inheritance is a good candidate to run in parallel.  You wouldn't even need to modify many lines of the sort execution code if you dispatch the work correctly, since it's just a matter of splitting the query plan and assigning its subnodes to workers.

Transactions and locks will be tricky, though; we might end up introducing a small snapshot-sharing infrastructure for the former, and a notion of session ID rather than process ID for the latter.  I don't think SnapshotNow is the problem, as the executor already reads catalogs with it today.
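To make the MergeAppend idea concrete, here is a minimal sketch (in Python, not PostgreSQL internals; the function names are hypothetical): each worker sorts one partition independently, and the parent performs a k-way merge of the pre-sorted runs, analogous to MergeAppend merging sorted subnodes from inheritance children.

```python
# Sketch only: parallel sort in the MergeAppend-over-partitions style.
# Each "partition" (think: inheritance child table) is sorted by a
# separate worker process; the parent merges the sorted runs, the way
# MergeAppend merges sorted subnode streams.
import heapq
from multiprocessing import Pool

def sort_partition(rows):
    # Worker side: sort one partition independently (per-subnode work).
    return sorted(rows)

def parallel_merge_append(partitions, workers=4):
    # Dispatch one sort per partition to the worker pool.
    with Pool(workers) as pool:
        sorted_runs = pool.map(sort_partition, partitions)
    # Parent side: k-way merge of the already-sorted runs.
    return list(heapq.merge(*sorted_runs))

if __name__ == "__main__":
    parts = [[5, 1, 9], [4, 4, 2], [8, 0, 7]]
    print(parallel_merge_append(parts))  # [0, 1, 2, 4, 4, 5, 7, 8, 9]
```

The point of the sketch is that the sort code itself is untouched; only the dispatch (who sorts which partition) and the final merge are new, which matches the claim above that few lines of sort execution code would need to change.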

Thanks,
Hitoshi


--
Hitoshi Harada
