Thread: Transparent table partitioning in future version of PG?

Transparent table partitioning in future version of PG?

From
henk de wit
Date:
Hi,

I was looking at the support that PostgreSQL offers for table partitioning at http://www.postgresql.org/docs/8.4/static/ddl-partitioning.html. The concept looks promising, but it's perhaps fair to say that PG doesn't really support partitioning natively; rather, one simulates it using a combination of existing PG features (namely inheritance, triggers, rules and constraint exclusion). This simulation does seem to work, but there are some disadvantages and caveats.

A major disadvantage is obviously that you need to set up and maintain the whole structure yourself (which is somewhat dangerous, or at least involves a lot of maintenance overhead). Next to that, it seemingly becomes hard to do simple queries like 'select * from foo where bar > 1000 and bar < 5000' when the answer to the query spans multiple partitions. Constraint exclusion works to some degree, but the document I referred to above tells me I can no longer use prepared statements then.
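For concreteness, here is a minimal sketch of the kind of manual setup the
documentation describes (table, column and range names are just examples,
not taken from a real schema):

    CREATE TABLE foo (
        id  bigint,
        bar integer NOT NULL
    );

    -- One child table per range; the CHECK constraint is what
    -- constraint exclusion uses to skip irrelevant partitions.
    CREATE TABLE foo_0000_1000 (CHECK (bar >= 0    AND bar < 1000)) INHERITS (foo);
    CREATE TABLE foo_1000_2000 (CHECK (bar >= 1000 AND bar < 2000)) INHERITS (foo);

    -- Redirect INSERTs on the parent into the proper child.
    CREATE OR REPLACE FUNCTION foo_insert_trigger() RETURNS trigger AS $$
    BEGIN
        IF NEW.bar >= 0 AND NEW.bar < 1000 THEN
            INSERT INTO foo_0000_1000 VALUES (NEW.*);
        ELSIF NEW.bar >= 1000 AND NEW.bar < 2000 THEN
            INSERT INTO foo_1000_2000 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'bar value % out of range', NEW.bar;
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER foo_insert
        BEFORE INSERT ON foo
        FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();

Every new range means another child table, another CHECK constraint and
another branch in the trigger function, which is the maintenance overhead
referred to above.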

I wonder if there are any plans to incorporate 'native' or 'transparent' partitioning in some future version of PG? By this I mean that I would basically be able to say something like (pseudo): "alter table foo partition on bar range 100", and PG would then simply start doing internally what we now have to do manually.

Is something like this on the radar, or is it just wishful thinking on my part?

Kind regards





Re: Transparent table partitioning in future version of PG?

From
Robert Haas
Date:
On Fri, May 1, 2009 at 10:32 AM, henk de wit <henk53602@hotmail.com> wrote:
> I was looking at the support that PostgreSQL offers for table partitioning
> at http://www.postgresql.org/docs/8.4/static/ddl-partitioning.html. The
> concept looks promising, but its maybe fair to say that PG itself doesn't
> really supports partitioning natively, but one can simulate it using some of
> the existing PG features (namely inheritance, triggers, rules and constraint
> exclusion). This simulating does seem to work, but there are some
> disadvantages and caveats.
> A major disadvantage is obviously that you need to set up and maintain the
> whole structure yourself (which is somewhat dangerous, or at least involves
> a lot of maintenance overhead). Next to that, it seemingly becomes hard to
> do simple queries likes 'select * from foo where bar> 1000 and bar < 5000',
> in case the answer to this query spans multiple partitions. constraint
> exclusion works to some degree, but the document I referred to above tells
> me I can no longer use prepared statements then.
> I wonder if there are any plans to incorporate 'native' or 'transparent'
> partitioning in some future version of PG? With this I mean that I would
> basically be able to say something like (pseudo): "alter table foo partition
> on bar range 100", and PG would then simply start doing internally what we
> now have to do manually.
> Is something like this on the radar or is it just wishful thinking of me?
> Kind regards

This has been discussed on this list multiple times previously; search
the archives.

The problem has been finding someone who has both the time and the
ability to do the work.

...Robert

Re: Transparent table partitioning in future version of PG?

From
Scott Carey
Date:
On 5/1/09 7:32 AM, "henk de wit" <henk53602@hotmail.com> wrote:

> Hi,
>
> I was looking at the support that PostgreSQL offers for table partitioning
> at http://www.postgresql.org/docs/8.4/static/ddl-partitioning.html. The
> concept looks promising, but its maybe fair to say that PG itself doesn't
> really supports partitioning natively, but one can simulate it using some of
> the existing PG features (namely inheritance, triggers, rules and constraint
> exclusion). This simulating does seem to work, but there are some
> disadvantages and caveats. 
>
> A major disadvantage is obviously that you need to set up and maintain the
> whole structure yourself (which is somewhat dangerous, or at least involves a
> lot of maintenance overhead). Next to that, it seemingly becomes hard to do
> simple queries likes 'select * from foo where bar> 1000 and bar < 5000', in
> case the answer to this query spans multiple partitions. constraint exclusion
> works to some degree, but the document I referred to above tells me I can no
> longer use prepared statements then.

More caveats:

Query plans go bad pretty quickly because the planner doesn't aggregate
statistics correctly when scanning more than one table.

The constraint exclusion code is completely and utterly broken on DELETE or
UPDATE queries once the table count gets large -- I can get the query planner
/ constraint exclusion machinery to eat up 7GB of RAM trying to figure out
which table to access when the number of partitions is around 6000.
The same query in SELECT form doesn't consume that memory, but still takes
over a second to plan.

This is "not a bug".
http://www.nabble.com/8.3.5:-Query-Planner-takes-15%2B-seconds-to-plan-Updat
e-or-Delete-queries-on-partitioned-tables.-td21992054.html

Past a certain point, and for any more complicated aggregation or join, it's
pretty much faster to do merge joins or hash joins client side across
multiple tables -- basically doing the partitioning client side.
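To see the planner-side cost in isolation, one can time EXPLAIN itself,
which plans but does not execute the statement (a sketch, reusing the
hypothetical foo/bar layout from the first message; the timings will
obviously depend on the partition count):

    SET constraint_exclusion = on;
    \timing

    -- Planning only: the elapsed time is dominated by the
    -- constraint-exclusion work over all child tables.
    EXPLAIN SELECT * FROM foo WHERE bar = 4242;
    EXPLAIN UPDATE foo SET id = id WHERE bar = 4242;
    EXPLAIN DELETE FROM foo WHERE bar = 4242;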

There is a lot of talk about overly complicated partitioning or
auto-partitioning, but two much simpler things would go a long way to
making this fairly workable:

Make statistics aggregation across tables better -- use a weighted average
for estimating row width, and aggregate distinct counts and correlations
more sensibly. Right now it mostly assumes the worst possible case and can
end up with very suboptimal plans.

Make a special case for "unique" child inheritance constraints that can be
checked much faster -- nobody wants to partition and have overlapping
constraint regions.  And whatever is going on on the UPDATE / DELETE side
that causes what should be the same constraint exclusion check as a SELECT
to take so much longer and use so much more memory needs to be attended to.

There would still be manual work in creating and managing partitions, but at
this point that is the _least_ of the problems.


>
> I wonder if there are any plans to incorporate 'native' or 'transparent'
> partitioning in some future version of PG? With this I mean that I would
> basically be able to say something like (pseudo): "alter table foo partition
> on bar range 100", and PG would then simply start doing internally what we now
> have to do manually.
>
> Is something like this on the radar or is it just wishful thinking of me?
>
> Kind regards


Re: Transparent table partitioning in future version of PG?

From
Simon Riggs
Date:
On Fri, 2009-05-01 at 11:27 -0400, Robert Haas wrote:

> The problem has been finding someone who has both the time and the
> ability to do the work.

Unfortunately there has been significant debate over which parts of
partitioning need to be improved. My own view is that considerable
attention needs to be applied to both the executor and planner to
improve matters and that syntax improvements are largely irrelevant,
though seductive.

Deep improvements will require significant analysis, agreement, effort
and skill. What we have now took approximately 20 days to implement,
with later patches adding about another 10-20 days of work. I'd estimate
the required work at 60-100 days from a primary author, plus planning
and discussion time. YMMV.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: Transparent table partitioning in future version of PG?

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Fri, 2009-05-01 at 11:27 -0400, Robert Haas wrote:
>> The problem has been finding someone who has both the time and the
>> ability to do the work.

> Unfortunately there has been significant debate over which parts of
> partitioning need to be improved. My own view is that considerable
> attention needs to be applied to both the executor and planner to
> improve matters and that syntax improvements are largely irrelevant,
> though seductive.

My thought about it is that what we really need is an explicit notion
of partitioned tables built into the system, instead of trying to make
the planner re-deduce the partitioning behavior from first principles
every time it builds a plan for such a table.  Such a notion would
presumably involve some new syntax to allow the partitioning rule to be
specified at table creation time.  I agree that the syntax details are a
minor issue, but the set of possible partitioning rules is certainly a
topic of great interest.

            regards, tom lane

Re: Transparent table partitioning in future version of PG?

From
Simon Riggs
Date:
On Wed, 2009-05-06 at 17:55 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Fri, 2009-05-01 at 11:27 -0400, Robert Haas wrote:
> >> The problem has been finding someone who has both the time and the
> >> ability to do the work.
>
> > Unfortunately there has been significant debate over which parts of
> > partitioning need to be improved. My own view is that considerable
> > attention needs to be applied to both the executor and planner to
> > improve matters and that syntax improvements are largely irrelevant,
> > though seductive.
>
> My thought about it is that what we really need is an explicit notion
> of partitioned tables built into the system, instead of trying to make
> the planner re-deduce the partitioning behavior from first principles
> every time it builds a plan for such a table.  Such a notion would
> presumably involve some new syntax to allow the partitioning rule to be
> specified at table creation time.  I agree that the syntax details are a
> minor issue, but the set of possible partitioning rules is certainly a
> topic of great interest.

Agreed. Perhaps I should say then that the syntax needs to express the
requirements of the planner/executor behaviour, rather than being the
main aspect of the feature, as some have suggested.

Hopefully, notions of partitioning won't be directly tied to chunking of
data for parallel query access. Most queries access recent data and
hence only a single partition (or stripe), so partitioning and
parallelism are frequently exactly orthogonal.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: Transparent table partitioning in future version of PG?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I think there should be a way to refer to individual partitions as
> objects.

Yeah, the individual partitions should be nameable tables, otherwise we
will be reinventing a *whole* lot of management stuff to little gain.
I don't actually think there is anything wrong with using table
inheritance as the basic infrastructure --- I just want more smarts
about one particular use pattern of inheritance.

            regards, tom lane

Re: Transparent table partitioning in future version of PG?

From
Alvaro Herrera
Date:
Simon Riggs wrote:


> Hopefully, notions of partitioning won't be directly tied to chunking of
> data for parallel query access. Most queries access recent data and
> hence only a single partition (or stripe), so partitioning and
> parallelism are frequently exactly orthogonal.

I think there should be a way to refer to individual partitions as
objects.  That way we could execute some commands to enable certain
optimizations, for example "mark this partition read only" which would
mean it could be marked as not needing vacuum.
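There is nothing like that today; the closest approximation I can think of
(assuming the 8.4-style per-table storage parameters) is simply telling
autovacuum to leave a static partition alone, which is not the same thing
as a real read-only marker:

    -- Partition name is hypothetical; this only stops autovacuum from
    -- revisiting the table, it does not make the table read only.
    ALTER TABLE foo_2008_archive SET (autovacuum_enabled = false);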

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Transparent table partitioning in future version of PG?

From
Craig Ringer
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I think there should be a way to refer to individual partitions as
>> objects.
>
> Yeah, the individual partitions should be nameable tables, otherwise we
> will be reinventing a *whole* lot of management stuff to little gain.
> I don't actually think there is anything wrong with using table
> inheritance as the basic infrastructure --- I just want more smarts
> about one particular use pattern of inheritance.

Maybe it's worth examining and documenting existing partition setups,
the reasoning behind them, and how they're implemented, in order to
guide any future plans for native partitioning support?

Maybe that's already been/being done. On the off chance that it's not:

Ones I can think of:

- Partitioning an equally active dataset by ranges over a key to improve
 scan performance, reduce INSERT/UPDATE costs on indexes, ease locking issues, etc.

- The "classic" active/archive partition scheme where there's only one
partition growing at any one time, and the others are historical data
that's nowhere near as "hot".

- A variant on the basic active/archive structure, where query activity
decreases slowly over time and there are many partitions of recent data.
Partitions are merged into larger ones as they age, somewhat like a RRD
database.

I also expect that in the future there will be demand for striping data
across multiple partitions in different tablespaces to exploit
in-parallel scanning (when/if supported) for better I/O utilization in
multiple-disk-array situations. For example, partitioning on
"MOD(id,10)" across 10 separate volumes, and firing off 10 concurrent
scans, one per partition, to satisfy a query.
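A sketch of what that last layout could look like with today's
inheritance-based scheme (tablespace paths and table names are made up):

    CREATE TABLESPACE vol0 LOCATION '/mnt/vol0/pgdata';
    CREATE TABLESPACE vol1 LOCATION '/mnt/vol1/pgdata';
    -- ... one tablespace per volume, up to vol9 ...

    CREATE TABLE big (id bigint, payload text);

    CREATE TABLE big_m0 (CHECK (id % 10 = 0)) INHERITS (big) TABLESPACE vol0;
    CREATE TABLE big_m1 (CHECK (id % 10 = 1)) INHERITS (big) TABLESPACE vol1;
    -- ... and so on through big_m9 ...

The ten concurrent scans would still have to be driven from the client (or
a future parallel executor), since nothing launches them automatically.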

Those are some simpler schemes. Does anyone actively using partitioning
have particular schemes/details that're worth going into?

--
Craig Ringer

Re: Transparent table partitioning in future version of PG?

From
Simon Riggs
Date:
On Thu, 2009-05-07 at 10:56 +0800, Craig Ringer wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> >> I think there should be a way to refer to individual partitions as
> >> objects.
> >
> > Yeah, the individual partitions should be nameable tables, otherwise we
> > will be reinventing a *whole* lot of management stuff to little gain.
> > I don't actually think there is anything wrong with using table
> > inheritance as the basic infrastructure --- I just want more smarts
> > about one particular use pattern of inheritance.
>
> Maybe it's worth examining and documenting existing partition setups,
> the reasoning behind them, and how they're implemented, in order to
> guide any future plans for native partitioning support?
>
> Maybe that's already been/being done. On the off chance that it's not:
>
> Ones I can think of:
>
> - Partitioning an equally active dataset by ranges over a key to improve
>  scan performance, INSERT/UPDATE costs on indexes, locking issues, etc.
>
> - The "classic" active/archive partition scheme where there's only one
> partition growing at any one time, and the others are historical data
> that's nowhere near as "hot".
>
> - A variant on the basic active/archive structure, where query activity
> decreases slowly over time and there are many partitions of recent data.
> Partitions are merged into larger ones as they age, somewhat like a RRD
> database.
>
> I also expect that in the future there will be demand for striping data
> across multiple partitions in different tablespaces to exploit
> in-parallel scanning (when/if supported) for better I/O utilization in
> multiple-disk-array situations. For example, partitioning on
> "MOD(id,10)" across 10 separate volumes, and firing off 10 concurrent
> scans, one per partition, to satisfy a query.

That's a good summary. It has already been documented and discussed, but
saying it again and again is the best way to get this across.

You've highlighted that partitioning is a feature with many underlying
requirements: infrequent access to data (frequently historical data),
striping for parallelism, and getting around RDBMS flaws (if any). We
must be careful to implement each requirement in full, yet separately,
so we don't end up with 60% functionality in each case by delivering an
average or least-common-denominator solution.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: Transparent table partitioning in future version of PG?

From
Scott Carey
Date:


On 5/7/09 1:54 AM, "Simon Riggs" <simon@2ndQuadrant.com> wrote:

>
>
> On Thu, 2009-05-07 at 10:56 +0800, Craig Ringer wrote:
>> Tom Lane wrote:
>>
>> I also expect that in the future there will be demand for striping data
>> across multiple partitions in different tablespaces to exploit
>> in-parallel scanning (when/if supported) for better I/O utilization in
>> multiple-disk-array situations. For example, partitioning on
>> "MOD(id,10)" across 10 separate volumes, and firing off 10 concurrent
>> scans, one per partition, to satisfy a query.
>
> That's a good summary. It has already been documented and discussed, but
> saying it again and again is the best way to get this across.
>
> You've highlighted that partitioning is a feature with many underlying
> requirements: infrequent access to data (frequently historical),

Actually, infrequent access is not a requirement.  It is a common
requirement, however.

Take, for instance, a very large set of data that contains an integer column
'type_id' with about 200 distinct values.  The data is accessed with a
strict 'type_id = X' requirement 99.9% of the time.  If this were one large
table, then scans of all sorts would become much more expensive than if it
were partitioned on 'type_id'.  Furthermore, partitioning on type_id removes
the need to even index on that value.  Statistics on each partition may vary
significantly, and the planner can thus adapt naturally to changes in the
data per value of type_id.

The raw need is not "infrequent access" but highly partitioned access.  It
doesn't matter if your date-partitioned data is accessed evenly across all
dates or skewed to the most frequent dates -- what matters is that you are
almost always accessing by small date ranges.
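A sketch of that layout under the current scheme (names are hypothetical),
where each child carries a simple equality constraint rather than a range:

    CREATE TABLE events (type_id integer NOT NULL, payload text);

    -- One child per type_id value; the equality CHECK takes the place of
    -- an index on type_id, and each child keeps its own statistics.
    CREATE TABLE events_t1 (CHECK (type_id = 1)) INHERITS (events);
    CREATE TABLE events_t2 (CHECK (type_id = 2)) INHERITS (events);
    -- ... roughly 200 such children in the scenario above ...

    -- The 99.9% access pattern; constraint exclusion prunes to one child.
    SELECT * FROM events WHERE type_id = 1;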

> striping for parallelism and getting around RDBMS flaws (if any). We
> must be careful to implement each requirement in full, yet separately,
> so we don't end up with 60% functionality in each case by delivering an
> average or least common denominator solution.
>
> --
>  Simon Riggs           www.2ndQuadrant.com
>  PostgreSQL Training, Services and Support
>
>


Re: Transparent table partitioning in future version of PG?

From
Robert Haas
Date:
On Wed, May 6, 2009 at 6:08 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Agreed. Perhaps I should say then that the syntax needs to express the
> requirements of the planner/executor behaviour, rather than being the
> main aspect of the feature, as some have suggested.

Agreed.

> Hopefully, notions of partitioning won't be directly tied to chunking of
> data for parallel query access. Most queries access recent data and
> hence only a single partition (or stripe), so partitioning and
> parallelism are frequently exactly orthogonal.

Yes, I think those things are unrelated.

...Robert

Re: Transparent table partitioning in future version of PG?

From
david@lang.hm
Date:
On Thu, 7 May 2009, Robert Haas wrote:

> On Wed, May 6, 2009 at 6:08 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Agreed. Perhaps I should say then that the syntax needs to express the
>> requirements of the planner/executor behaviour, rather than being the
>> main aspect of the feature, as some have suggested.
>
> Agreed.
>
>> Hopefully, notions of partitioning won't be directly tied to chunking of
>> data for parallel query access. Most queries access recent data and
>> hence only a single partition (or stripe), so partitioning and
>> parallelism are frequently exactly orthogonal.
>
> Yes, I think those things are unrelated.

I'm not so sure (warning: I am relatively inexperienced in this area)

it sounds like you can take two basic approaches to partitioning a database

1. The Isolation Plan

    you want to have it so that your queries match your partitioning.

   this is with the goal of only having to query a small number of
partitions, minimizing the total amount of data touched (including
minimizing the number of indexes searched)

   this matches the use case mentioned above, with the partition based on
date and only looking at the most recent date range.

2. The Load Balancing Plan

   you want to have your partitioning and your queries _not_ match as much
as possible

   this is with the goal of having the query hit as many partitions as
possible, so that the different parts of the search can happen in parallel


However, with either partitioning plan, you will have queries that
degenerate to look like the other plan.

In the case of the isolation plan, you may need to search for all
instances of a rare thing over the entire history (after all, if you never
need to access that history, why do you pay for disks to store it? ;-)

and even when you are searching a narrow time window, it may still span
multiple partitions. I have a log analysis setup using the proprietary
Splunk database; it partitions by time, creating a new partition as the
current one hits a configurable size (by default 10G on 64-bit systems).
for my volume of logs I end up with each partition only covering a few
hours. it's very common to want to search over a few days, which can be a
few dozen partitions (this is out of many hundreds of partitions, so it's
still a _huge_ win to narrow the timeframe)


In the case of the load balancing plan, you may run into a query that
happens to fall into only one partition (the query matches your
partitioning logic)




I think the only real difference is how common it is to need to search
multiple partitions.

If the expectation is that you will frequently need to search most/all of
the partitions (the load balancing plan), then it's a waste of time to
analyse the query to try and figure out which partitions you need to look
at.

If the expectation is that you will frequently only need to search a small
number of the partitions (the isolation plan), then it's extremely valuable
to spend as much time as needed analysing the query to figure out which
partitions you need to look at.


I believe that the isolation plan is probably more common than the load
balancing plan, but I don't see them as being that different from the
database engine's point of view. To tune a system that can handle the
isolation plan for load balancing, the key thing to do would be to have a
knob to disable the partition planning, and just blindly send the search
out to every partition.
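(For what it's worth, the current inheritance-based implementation already
has something close to that knob: constraint exclusion can be switched off
per session, after which every child table is scanned unconditionally. A
sketch, using the hypothetical foo/bar layout from earlier in the thread:)

    SET constraint_exclusion = off;
    -- the planner now skips the per-partition constraint proofs and
    -- includes every child of foo in the plan
    SELECT * FROM foo WHERE bar BETWEEN 1000 AND 5000;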

David Lang

Re: Transparent table partitioning in future version of PG?

From
Robert Haas
Date:
On Thu, May 7, 2009 at 10:52 PM,  <david@lang.hm> wrote:
>>> Hopefully, notions of partitioning won't be directly tied to chunking of
>>> data for parallel query access. Most queries access recent data and
>>> hence only a single partition (or stripe), so partitioning and
>>> parallelism are frequently exactly orthogonal.
>>
>> Yes, I think those things are unrelated.
>
> I'm not so sure (warning, I am relativly inexperianced in this area)
>
> it sounds like you can take two basic approaches to partition a database
>
> 1. The Isolation Plan
[...]
> 2. The Load Balancing Plan

Well, even if the table is not partitioned at all, I don't see that it
should preclude parallel query access.  If I've got a 1 GB table that
needs to be sequentially scanned for rows meeting some restriction
clause, and I have two CPUs and plenty of I/O bandwidth, ISTM it
should be possible to have them each scan half of the table and
combine the results.  Now, this is not easy and there are probably
substantial planner and executor changes required to make it work, but
I don't know that it would be particularly easier if I had two 500 MB
partitions instead of a single 1 GB table.

IOW, I don't think you should need to partition if all you want is
load balancing.  Partitioning should be for isolation, and load
balancing should happen when appropriate, whether there is
partitioning involved or not.

...Robert

Re: Transparent table partitioning in future version of PG?

From
Scott Carey
Date:
On 5/7/09 7:52 PM, "david@lang.hm" <david@lang.hm> wrote:

>
>
> I believe that the isolation plan is probably more common than the load
> balancing plan, but I don't see them as being that different for the
> database engine point of view. To tune a system that can handle the
> isolation plan for load balancing, the key thing to do would be to have a
> knob to disable the partition planning, and just blindly send the search
> out to every partition.

Lots of good points.  However, implicit in the above is that the process of
identifying which partitions contain the data is expensive.
Right now it is: about 1.5 seconds with 6000 partitions and the simplest
possible constraint (column = CONSTANT).

But identifying which partitions can contain a value is really nothing more
than an index lookup.  If you constrain the possible partitioning functions
to those where a single partition key can only exist in one partition, then
this index and its lookup should be very fast even for large partition
counts.  From what I can tell empirically, the current system does this as
more of a sequential scan, running the constraint checks for each
possibility.
Furthermore, the actual tables don't have to contain the data if the key is
a column identity function (date = X) rather than a range or hash.

At the core, partitioning is really just a form of 'chunky' indexing that
doesn't fragment, need re-indexing, or have much MVCC complexity.
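To make the "it's really just an index" point concrete, the lookup the
planner needs could in principle be as cheap as an indexed query against a
small catalog of partition bounds, something like this purely illustrative
sketch (not an existing catalog, just a way of picturing the cost):

    -- Hypothetical bounds catalog: one row per child partition.
    CREATE TABLE partition_map (
        parent  text   NOT NULL,   -- parent table name
        child   text   NOT NULL,   -- child partition name
        lo      bigint NOT NULL,   -- inclusive lower bound of the key
        hi      bigint NOT NULL    -- exclusive upper bound
    );
    CREATE INDEX partition_map_idx ON partition_map (parent, lo, hi);

    -- Finding the partitions that can hold bar = 4242 is then an index
    -- lookup rather than a constraint proof against every child.
    SELECT child
      FROM partition_map
     WHERE parent = 'foo'
       AND lo <= 4242 AND 4242 < hi;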

>
> David Lang
>


Re: Transparent table partitioning in future version of PG?

From
david@lang.hm
Date:
On Fri, 8 May 2009, Robert Haas wrote:

> On Thu, May 7, 2009 at 10:52 PM,  <david@lang.hm> wrote:
>>>> Hopefully, notions of partitioning won't be directly tied to chunking of
>>>> data for parallel query access. Most queries access recent data and
>>>> hence only a single partition (or stripe), so partitioning and
>>>> parallelism are frequently exactly orthogonal.
>>>
>>> Yes, I think those things are unrelated.
>>
>> I'm not so sure (warning, I am relativly inexperianced in this area)
>>
>> it sounds like you can take two basic approaches to partition a database
>>
>> 1. The Isolation Plan
> [...]
>> 2. The Load Balancing Plan
>
> Well, even if the table is not partitioned at all, I don't see that it
> should preclude parallel query access.  If I've got a 1 GB table that
> needs to be sequentially scanned for rows meeting some restriction
> clause, and I have two CPUs and plenty of I/O bandwidth, ISTM it
> should be possible to have them each scan half of the table and
> combine the results.  Now, this is not easy and there are probably
> substantial planner and executor changes required to make it work, but
> I don't know that it would be particularly easier if I had two 500 MB
> partitions instead of a single 1 GB table.
>
> IOW, I don't think you should need to partition if all you want is
> load balancing.  Partitioning should be for isolation, and load
> balancing should happen when appropriate, whether there is
> partitioning involved or not.

actually, I will contradict myself slightly.

with the Isolation Plan there is not necessarily a need to run the query
on each partition in parallel.

   if parallel queries are possible, they will benefit Isolation Plan
partitioning, but the biggest win with this plan is just reducing the
number of partitions that need to be queried.

with the Load Balancing Plan there is no benefit in partitioning unless
you have the ability to run queries on each partition in parallel


using a separate back-end process to do a query on a separate partition is
a fairly straightforward, but not trivial, thing to do (there are
complications in merging the result sets, including the need to be able to
do part of a query, merge the results, then use those results for the next
step in the query)

   I would also note that there does not seem to be a huge conceptual
difference between doing these parallel queries on one computer and
shipping the queries off to other computers.


however, trying to split the work on a single table runs into all sorts of
'interesting' issues with things needing to be shared between the multiple
processes (they both need to use the same indexes, for example)

so I think that it is much easier for the database engine to efficiently
search two 500G tables instead of one 1T table.

David Lang

Re: Transparent table partitioning in future version of PG?

From
Scott Carey
Date:
On 5/8/09 11:20 AM, "david@lang.hm" <david@lang.hm> wrote:
>
> with the Load Balancing Plan there is no benefit in partitioning unless
> you have the ability to run queries on each partition in parallel
>

I think there is a benefit to partitioning in this case.  If the statistics
on other columns are highly skewed relative to the column(s) partitioned on,
the planner's statistics will be better.  It may have to access every
partition, but it doesn't have to access every partition in the same way.

Perhaps something like: user_id = 'FOO' is one of the most common values in
date partition A, and one of the least common values in B, so a where clause
with user_id = 'FOO' will sequential scan one and index scan the other.

For really large tables with data correlation that varies significantly,
this can be a huge performance gain even if all partitions are accessed.


Re: Transparent table partitioning in future version of PG?

From
Robert Haas
Date:
>> IOW, I don't think you should need to partition if all you want is
>> load balancing.  Partitioning should be for isolation, and load
>> balancing should happen when appropriate, whether there is
>> partitioning involved or not.
>
> actually, I will contradict myself slightly.
>
[...]
> however, trying to split the work on a single table runs into all sorts of
> 'interesting' issues with things needing to be shared between the multiple
> processes (they both need to use the same indexes, for example)

I disagree with this part of your email.  It is already the case that
tables and indexes need to support concurrent access by multiple
Postgres processes.  I don't see why that part of the problem would be
any more difficult for parallel query execution than it would be for
executing two different and unrelated queries on the same table.

> so I think that it is much easier for the database engine to efficiently
> search two 500G tables instead of one 1T table.

And that leads me to the opposite conclusion on this point.

...Robert

Re: Transparent table partitioning in future version of PG?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> so I think that it is much easier for the database engine to efficiently
>> search two 500G tables instead of one 1T table.

> And that leads me to the opposite conclusion on this point.

I don't think there would be any difference on that score, either.

            regards, tom lane

Re: Transparent table partitioning in future version of PG?

From
Craig Ringer
Date:
Robert Haas wrote:

> Well, even if the table is not partitioned at all, I don't see that it
> should preclude parallel query access.  If I've got a 1 GB table that
> needs to be sequentially scanned for rows meeting some restriction
> clause, and I have two CPUs and plenty of I/O bandwidth, ISTM it
> should be possible to have them each scan half of the table and
> combine the results.  Now, this is not easy and there are probably
> substantial planner and executor changes required to make it work, but
> I don't know that it would be particularly easier if I had two 500 MB
> partitions instead of a single 1 GB table.

The point of partitioning in this scenario is primarily that you can put
the different partitions in different tablespaces, most likely on
independent disk devices, and therefore get more I/O bandwidth.
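For example (a sketch; the tablespace path and partition name are made up),
an existing partition can be placed on its own device-backed tablespace:

    CREATE TABLESPACE array2 LOCATION '/mnt/array2/pgdata';
    ALTER TABLE foo_1000_2000 SET TABLESPACE array2;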

--
Craig Ringer