Re: Hash partitioning. - Mailing list pgsql-hackers

From Yuri Levinsky
Subject Re: Hash partitioning.
Date
Msg-id B72526FA2066E344AFD09734A487318103E92BC9@falcon1.celltick.com
In response to Re: Hash partitioning.  ("ktm@rice.edu" <ktm@rice.edu>)
Responses Re: Hash partitioning.
List pgsql-hackers
Markus,
There is no relation between partitions and RAID, even though they both
distribute data somehow. At the end of the day, when you use RAID you
have one single device with some performance limitations. When you want
to improve your data access beyond that, and you don't want to work with
huge indexes that you are unable to maintain, or you don't want to use
an index at all, as in the case of range partitioning by time or hash
partitioning, you are welcome to use partitions. You typically don't
want to use a b-tree index when you select more than ~1-2% of your data.
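
To illustrate the point about avoiding one huge index, here is a minimal
sketch of hash partitioning, not PostgreSQL's implementation. The
partition count, the crc32-based hash, and the function names are
assumptions made for illustration only. Rows are routed to a partition by
a hash of the key, so a lookup scans one small partition instead of the
whole data set.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for the example


def partition_for(key: str) -> int:
    """Map a key to a partition number.

    crc32 is used because it is deterministic across runs, unlike
    Python's built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_partitions(rows):
    """Route each (key, value) row to the partition chosen by its key hash."""
    partitions = defaultdict(list)
    for key, value in rows:
        partitions[partition_for(key)].append((key, value))
    return partitions


def lookup(partitions, key):
    """Scan only the single partition that can contain the key."""
    candidate = partitions.get(partition_for(key), [])
    return [value for k, value in candidate if k == key]


rows = [(f"user{i}", i) for i in range(1000)]
partitions = build_partitions(rows)
print(lookup(partitions, "user42"))
```

Every row lands in exactly one partition, so the scan for a given key
touches roughly 1/NUM_PARTITIONS of the data, without any index.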

Sincerely yours,


Yuri Levinsky, DBA
Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel
Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222


-----Original Message-----
From: ktm@rice.edu [mailto:ktm@rice.edu]
Sent: Wednesday, June 26, 2013 5:01 PM
To: Markus Wanner
Cc: Kevin Grittner; Claudio Freire; Robert Haas; Bruce Momjian; Yuri
Levinsky; PostgreSQL-Dev
Subject: Re: [HACKERS] Hash partitioning.

On Wed, Jun 26, 2013 at 03:47:43PM +0200, Markus Wanner wrote:
> On 06/25/2013 11:52 PM, Kevin Grittner wrote:
> > At least until we have parallel
> > query execution.  At *that* point this all changes.
>
> Can you elaborate on that, please? I currently have a hard time
> imagining how partitions can help performance in that case, either. At
> least compared to modern RAID and read-ahead capabilities.
>
> After all, RAID can be thought of as hash partitioning with a very
> weird hash function. Or maybe rather range partitioning on an internal
> key.
>
> Put another way: ideally, the system should take care of optimally
> distributing data across its physical storage itself. If you need to
> do partitioning manually for performance reasons, that's actually a
> deficiency of it, not a feature.
>
> I certainly agree that manageability may be a perfectly valid reason
> to partition your data. Maybe there even exist other good reasons. I
> don't think performance optimization is one. (It's more like giving
> the system a hint. And we all dislike hints, don't we? *ducks*)
>
> Regards
>
> Markus Wanner
>

Hi Markus,

I think he is referring to the fact that with parallel query execution,
multiple partitions can be processed simultaneously instead of serially,
as they are now, with a resulting speed increase.
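
A rough sketch of that idea, assuming the data is already split into
partitions: each partition is scanned by its own worker and the partial
results are combined at the end. The function names and the thread pool
are illustrative assumptions, not how PostgreSQL does or would do it, and
Python threads do not give real CPU parallelism for this kind of work, so
this shows only the shape of the computation.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(partition, predicate):
    """Count the rows in one partition that match the predicate."""
    return sum(1 for row in partition if predicate(row))


def serial_count(partitions, predicate):
    """Scan the partitions one after another."""
    return sum(scan_partition(p, predicate) for p in partitions)


def parallel_count(partitions, predicate, workers=4):
    """Scan each partition in its own worker, then add the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(scan_partition, p, predicate) for p in partitions]
        return sum(f.result() for f in futures)


def is_even(row):
    return row % 2 == 0


# Four hypothetical partitions that together hold the values 0..999.
partitions = [list(range(i, 1000, 4)) for i in range(4)]

print(serial_count(partitions, is_even))
print(parallel_count(partitions, is_even))
```

Both functions return the same count; the parallel version differs only
in that the per-partition scans are independent units of work.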

Regards,
Ken
