Re: Hash partitioning. - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Hash partitioning.
Msg-id CAMkU=1wsR3DvvsxmAsoiQAHGnW+_UFETcQsig-MP6JL9cK098Q@mail.gmail.com
In response to Re: Hash partitioning.  (Markus Wanner <markus@bluegap.ch>)
Responses Re: Hash partitioning.
List pgsql-hackers
On Wed, Jun 26, 2013 at 8:55 AM, Markus Wanner <markus@bluegap.ch> wrote:
On 06/26/2013 05:46 PM, Heikki Linnakangas wrote:
> We could also allow a large query to search a single table in parallel.
> A seqscan would be easy to divide into N equally-sized parts that can be
> scanned in parallel. It's more difficult for index scans, but even then
> it might be possible at least in some limited cases.

So far reading sequentially is still faster than hopping between
different locations. Purely from the I/O perspective, that is.


Wouldn't any I/O system used on a high-end machine be fairly good at making this work through interleaved read-ahead algorithms?  Also, hopefully the planner would be able to predict when parallelization has nothing to add and avoid using it, although that is surely easier said than done.

For queries where the single CPU core turns into a bottle-neck and which
we want to parallelize, we should ideally still do a normal, fully
sequential scan and only fan out after the scan and distribute the
incoming pages (or even tuples) to the multiple cores to process.

That sounds like it would be much more susceptible to lock contention, and harder to get bug-free, than dividing into bigger chunks, like whole 1 gig segments.  
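[Archive note: a hypothetical sketch, not PostgreSQL code. The chunking Jeff describes, where each worker gets a contiguous range of blocks and scans it sequentially on its own, can be illustrated like this; the function name and the 1 GiB segment constant are made up for the example.]

```python
SEGMENT_BLOCKS = 131072  # blocks per 1 GiB segment at an 8 kB block size

def chunk_ranges(total_blocks, n_workers):
    """Divide [0, total_blocks) into n_workers contiguous block ranges.

    Each worker scans only its own range sequentially, so workers need
    no per-tuple coordination, unlike fine-grained fan-out.
    """
    base, extra = divmod(total_blocks, n_workers)
    ranges, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

print(chunk_ranges(10, 3))  # -> [(0, 4), (4, 7), (7, 10)]
```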

Fanning out line by line (according to line_number % number_processes) was my favorite parallelization method in Perl, but those files were read only and so had no concurrency issues.
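[Archive note: the modulo fan-out Jeff mentions can be sketched as follows, in Python rather than Perl purely for illustration. This only works cleanly because, as he says, the input is read-only; with concurrent writers the per-item distribution would need locking.]

```python
def fan_out(lines, n_workers):
    """Assign line i to worker i % n_workers (round-robin)."""
    buckets = [[] for _ in range(n_workers)]
    for i, line in enumerate(lines):
        buckets[i % n_workers].append(line)
    return buckets

print(fan_out(["a", "b", "c", "d", "e"], 2))
# -> [['a', 'c', 'e'], ['b', 'd']]
```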
 
Cheers,

Jeff
