Re: Hash partitioning. - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Hash partitioning.
Msg-id CAMkU=1xgu2T7VK-Poows+9vMr+nTxnwFKBYQ2LtgWCv=rm6KPw@mail.gmail.com
In response to Re: Hash partitioning.  ("ktm@rice.edu" <ktm@rice.edu>)
List pgsql-hackers
On Wed, Jun 26, 2013 at 7:01 AM, ktm@rice.edu <ktm@rice.edu> wrote:
> On Wed, Jun 26, 2013 at 03:47:43PM +0200, Markus Wanner wrote:
> > On 06/25/2013 11:52 PM, Kevin Grittner wrote:
> > > At least until we have parallel
> > > query execution.  At *that* point this all changes.
> >
> > Can you elaborate on that, please? I currently have a hard time
> > imagining how partitions can help performance in that case, either. At
> > least compared to modern RAID and read-ahead capabilities.
> >
> > After all, RAID can be thought of as hash partitioning with a very weird
> > hash function. Or maybe rather range partitioning on an internal key.
> >
> > Put another way: ideally, the system should take care of optimally
> > distributing data across its physical storage itself. If you need to do
> > partitioning manually for performance reasons, that's actually a
> > deficiency of it, not a feature.

+1, except I'm looking at it from a CPU perspective, not a disk perspective.

I would hope not to need to partition my data at all in order to enable parallel execution.  I certainly would hope not to redo that partitioning just because I got new hardware with a different number of CPUs.


> Hi Markus,
>
> I think he is referring to the fact that with parallel query execution,
> multiple partitions can be processed simultaneously instead of serially
> as they are now with the resulting speed increase.


Hopefully parallel execution can divide the query into multiple "chunks" on its own, without me needing to micromanage it.
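
As a rough illustration of what dividing the work into "chunks" could look like, here is a minimal sketch that splits a table's block numbers into contiguous ranges, one per worker, at execution time. It is only an assumption made for illustration: the function name split_into_chunks and the idea of chunking by block range are hypothetical and do not describe how PostgreSQL does or will implement parallel query.

```python
def split_into_chunks(total_blocks: int, workers: int) -> list[range]:
    """Divide block numbers 0..total_blocks-1 into contiguous ranges.

    Hypothetical helper: returns at most `workers` non-empty ranges whose
    sizes differ by no more than one block.
    """
    if total_blocks < 0 or workers < 1:
        raise ValueError("total_blocks must be >= 0 and workers must be >= 1")

    # Every worker gets `base` blocks; the first `extra` workers get one more.
    base, extra = divmod(total_blocks, workers)

    chunks = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer blocks than workers: stop handing out empty chunks
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # The same table is re-chunked for a different CPU count without
    # touching how the data is stored.
    for cpus in (4, 8):
        print(cpus, [(c.start, c.stop) for c in split_into_chunks(1000, cpus)])
```

The point of the sketch is that the number of chunks is an argument chosen when the query runs, so moving to hardware with a different number of CPUs changes only that argument, not the layout of the data.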

Cheers,

Jeff
