Thread: partitioning
hi

does table partitioning exist under PostgreSQL? or does an alternative exist?

Oracle implements this feature: it allows rows to be spread across different
partitions depending on an attribute.
For example, my application stores XML documents in the database, and
partitioning is used to spread the different documents across different
partitions.

thanks
On Wed, 2005-03-09 at 10:31, FERREIRA William (COFRAMI) wrote:
> hi
>
> does table partitioning exist under PostgreSQL? or does an alternative
> exist?
>
> Oracle implements this feature: it allows rows to be spread across
> different partitions depending on an attribute.
> For example, my application stores XML documents in the database, and
> partitioning is used to spread the different documents across different
> partitions.

Currently, doing this in PostgreSQL is more of a "roll your own" thing.
You create the base tables on different tablespaces, then create an
updatable view with the proper triggers to make sure that the incoming
records go to the right tables.

With the advent of very large RAID arrays with very fast caching
controllers, this methodology is becoming less and less necessary.
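For concreteness, a minimal sketch of the "roll your own" approach Scott
describes might look like the following (table and column names are invented
here; in 8.0 the routing has to be done with rewrite rules, since plain views
cannot have triggers):

create table docs_current (id integer, doc text, doc_class text);  -- e.g. on a fast tablespace
create table docs_archive (id integer, doc text, doc_class text);  -- e.g. on a slower tablespace

create view docs as
    select * from docs_current
    union all
    select * from docs_archive;

create rule docs_ins_current as on insert to docs
    where NEW.doc_class = 'current'
    do instead insert into docs_current values (NEW.*);

create rule docs_ins_archive as on insert to docs
    where NEW.doc_class = 'archive'
    do instead insert into docs_archive values (NEW.*);

-- note: an insert whose doc_class matches neither rule will fail,
-- since the bare view itself is not insertable.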
Tablespaces are the closest thing. They were introduced in 8.0:
http://www.postgresql.org/docs/8.0/static/sql-createtablespace.html

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Mar 9, 2005, at 10:31 AM, FERREIRA William (COFRAMI) wrote:
> [original question quoted above]
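For reference, basic tablespace usage in 8.0 looks something like this (the
directory path and names below are only placeholders):

-- run as a superuser; the directory must already exist and be owned
-- by the postgres system user
create tablespace fast_ts location '/mnt/fastarray/pgdata';

-- place a table (or index) on that tablespace at creation time
create table xml_docs (id serial, doc text) tablespace fast_ts;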
Scott Marlowe <smarlowe@g2switchworks.com> writes:

> With the advent of very large RAID arrays with very fast caching
> controllers, this methodology is becoming less and less necessary.

I think the evidence is to the contrary. Witness the rather dramatic surge in
inquiries about this on this list. A year ago there were only two or three of
us pining for this feature. Now it's a weekly refrain.

Very large, very fast RAID arrays just mean that people want to gather that
much more data. They would still like to make the best use of their hardware
and not waste their resources on tremendously inefficient purging and loading
procedures when it would be possible to do these things instantaneously. That
only becomes more important as the investment they want to leverage becomes
larger.

--
greg
Scott Marlowe wrote:

> On Wed, 2005-03-09 at 10:31, FERREIRA William (COFRAMI) wrote:
>
>> [original question snipped]
>
> Currently, doing this in PostgreSQL is more of a "roll your own" thing.
> You create the base tables on different tablespaces, then create an
> updatable view with the proper triggers to make sure that the incoming
> records go to the right tables.
>
> With the advent of very large RAID arrays with very fast caching
> controllers, this methodology is becoming less and less necessary.

I don't agree with this. There are many PostgreSQL users who have 100, 200,
600 GB databases that don't have the budget to purchase a $20,000 array.

Table partitioning is a way to keep things efficient. That should be true
regardless of technology. RAM is cheap, so let's just use as much of it as we
can even if it means we swap.

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
PostgreSQL support, programming, shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
On Wed, 2005-03-09 at 13:07, Greg Stark wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
>
> > With the advent of very large RAID arrays with very fast caching
> > controllers, this methodology is becoming less and less necessary.
>
> I think the evidence is to the contrary. Witness the rather dramatic surge
> in inquiries about this on this list. A year ago there were only two or
> three of us pining for this feature. Now it's a weekly refrain.
>
> Very large, very fast RAID arrays just mean that people want to gather that
> much more data. They would still like to make the best use of their
> hardware and not waste their resources on tremendously inefficient purging
> and loading procedures when it would be possible to do these things
> instantaneously. That only becomes more important as the investment they
> want to leverage becomes larger.

Actually, I think it is the more common scenario of migrating off of Oracle
or DB2 and onto PostgreSQL, and bringing along the experience gained there
over the years, that has caused this refrain to sprout up more and more
often. With a database sitting on an enterprise-class storage subsystem with
8 gigs of battery-backed cache onboard, the need for partitioning is less and
less pressing. Not completely gone, and there are times when it can come in
handy (i.e. when you have to "roll your own" on a more limited budget).

I'm not sure what your point about purging and loading is. A properly built
home-rolled partitioning setup requires none of that. It's just that whereas
Oracle does it semi-automagically, the PostgreSQL DBA sets up the equivalent
by hand. No purging or loading that I'm aware of is needed.

Witness how often OTHER issues pop up (=NULL versus IS NULL, autonomous
transactions, quotas) that are from Oracle land nowadays. It isn't that the
Oracle way is always the best way, it's just what folks are often used to.

I really didn't see anything in your post that argued against my point that a
large enterprise-class RAID array largely eliminates the need for
application-level partitioning of data.
On Wed, 2005-03-09 at 13:21, Joshua D. Drake wrote:
> Scott Marlowe wrote:
>
> > [earlier discussion snipped]
> >
> > With the advent of very large RAID arrays with very fast caching
> > controllers, this methodology is becoming less and less necessary.
>
> I don't agree with this. There are many PostgreSQL users who have 100, 200,
> 600 GB databases that don't have the budget to purchase a $20,000 array.

A good AMI MegaRAID card with 512 meg of battery-backed cache will work the
same here, i.e. eliminate the need for partitioning. It doesn't have to cost
$20,000 to do it. And let's never forget that the data is where the value is,
not the hardware.

Notice I wasn't saying there was never a need anyway, just that it's becoming
less of an issue each day. Which it is. Back in the day a decent
single-SCSI-interface RAID card with battery-backed cache was >$2,000.
Nowadays they've dropped to the $300 to $500 range. I can't see the cost of a
DBA figuring out and implementing partitioning schemes costing less than that
in time. Plus it puts your data on a more reliable system. Penny wise and
pound foolish to go cheap, in my opinion. Anyone keeping 600 gigs of data and
trying to save $500 in hardware costs is saving costs in the wrong places,
imnsho...

> Table partitioning is a way to keep things efficient. That should be true
> regardless of technology.

Yes, it is a way. It's just a less necessary one than it once was, with
hardware now able to provide the same performance increase with little or no
work on the user's part. We've got to weigh the increased complexity it would
take to implement it in PostgreSQL and maintain it versus the gain, and I say
the gain is smaller every day.

> RAM is cheap, so let's just use as much of it as we can even if it means we
> swap.

What does table partitioning have to do with RAM?
Scott Marlowe <smarlowe@g2switchworks.com> writes:

> Actually, I think it is the more common scenario of migrating off of Oracle
> or DB2 and onto PostgreSQL, and bringing along the experience gained there
> over the years, that has caused this refrain to sprout up more and more
> often. With a database sitting on an enterprise-class storage subsystem
> with 8 gigs of battery-backed cache onboard, the need for partitioning is
> less and less pressing. Not completely gone, and there are times when it
> can come in handy (i.e. when you have to "roll your own" on a more limited
> budget).

You don't think the people using Oracle are even *more* likely to have a big
storage subsystem with gobs of cache? At my previous job we had a Hitachi
system that was really ludicrously fast. Nonetheless, when we implemented
partitioning it was a real life saver for the DBAs. Their workload went from
being >50% dealing with problems with the large weekly jobs to basically
being able to ignore these jobs. In fact the job to handle the changeover was
moved to the middle of the peak period because it was just more convenient
that way.

The bigger the storage, the *more* important partitioning is. Not less.
That's why it's such a huge feature for Oracle, and it's why databases used
on smaller projects don't find it a compelling feature.

> I'm not sure what your point about purging and loading is. A properly built
> home-rolled partitioning setup requires none of that.

Well, that's sort of the point. But home-rolled partitioning setups have
other problems. That's why it would be good to have a solid implementation
that didn't have these problems.

> Witness how often OTHER issues pop up (=NULL versus IS NULL, autonomous
> transactions, quotas) that are from Oracle land nowadays. It isn't that the
> Oracle way is always the best way, it's just what folks are often used to.

=NULL is an Access thing, actually. But yes, these other features are also
things that the bigger boys need. But the most commonly requested one these
days seems to be partitioning. Maybe I'm biased though.

> I really didn't see anything in your post that argued against my point that
> a large enterprise-class RAID array largely eliminates the need for
> application-level partitioning of data.

Firstly, it's not application level if it's native. The Postgres options such
as inherited tables or union views do indeed impose application-level
constraints, but a good native implementation is completely transparent to
the programmer.

Well, so you're saying that you believe me that on my 1GB database I find it
more convenient to be able to pull off 100MB of data instantaneously and
without generating any garbage for vacuum to clean up. But you don't believe
someone running a 1TB storage subsystem would appreciate the same feature as
much when they have to pull off 10GB of data, because their system is 10x
faster at doing this unnecessary work than mine would be, so it only takes
100x as much time?

--
greg
On Wed, 2005-03-09 at 14:47, Greg Stark wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
>
> > [snip]
>
> You don't think the people using Oracle are even *more* likely to have a
> big storage subsystem with gobs of cache? At my previous job we had a
> Hitachi system that was really ludicrously fast. Nonetheless, when we
> implemented partitioning it was a real life saver for the DBAs. Their
> workload went from being >50% dealing with problems with the large weekly
> jobs to basically being able to ignore these jobs. In fact the job to
> handle the changeover was moved to the middle of the peak period because it
> was just more convenient that way.

Wait, I'm not sure I understand your point here yet. Are you saying something
along the lines of: with a 1TB storage array, putting all the data in one big
partition gave the DBAs problems, but when they partitioned it down to, say,
ten 100GB partitions and split the data set across them, they had fewer
problems? Or are you talking more about quick access, because when you have a
table with three fields and one is rather narrow, that field can be accessed
much more quickly since you aren't dragging around the extra fields you don't
need? Was this some kind of massive batch processing system?

> The bigger the storage, the *more* important partitioning is. Not less.
> That's why it's such a huge feature for Oracle, and it's why databases used
> on smaller projects don't find it a compelling feature.
>
> > I'm not sure what your point about purging and loading is. A properly
> > built home-rolled partitioning setup requires none of that.
>
> Well, that's sort of the point. But home-rolled partitioning setups have
> other problems. That's why it would be good to have a solid implementation
> that didn't have these problems.

While I'm sure administration overhead is the major problem, I'm wondering
what other ones you've seen, i.e. performance, reliability, user error, that
are more common with home-rolled partitioning.

> > I really didn't see anything in your post that argued against my point
> > that a large enterprise-class RAID array largely eliminates the need for
> > application-level partitioning of data.
>
> Firstly, it's not application level if it's native.

Sorry, by application I meant the PostgreSQL client layer, i.e. implemented
in the user layer of PostgreSQL, not in the core of it.

> Well, so you're saying that you believe me that on my 1GB database I find
> it more convenient to be able to pull off 100MB of data instantaneously and
> without generating any garbage for vacuum to clean up. But you don't
> believe someone running a 1TB storage subsystem would appreciate the same
> feature as much when they have to pull off 10GB of data, because their
> system is 10x faster at doing this unnecessary work than mine would be, so
> it only takes 100x as much time?

Why in the world would a view-encased union of several tables generate
garbage for a vacuum to clean up?

I'd think a well-rolled view/union/trigger based solution would operate
identically to a PostgreSQL internally implemented partitioning system.

But what I'm really saying is that between good home-grown partitioning and
fast hardware, the need for the pg devel team to implement partitioning is
pretty low.
Scott Marlowe <smarlowe@g2switchworks.com> writes:

> Wait, I'm not sure I understand your point here yet. Are you saying
> something along the lines of: with a 1TB storage array, putting all the
> data in one big partition gave the DBAs problems, but when they partitioned
> it down to, say, ten 100GB partitions and split the data set across them,
> they had fewer problems?

I'm not sure we're talking about the same thing here. Partitioned tables are
typically used when the data falls naturally into different buckets and the
different buckets need to be treated differently. Especially when all the
data in a single bucket often needs to be moved en masse.

Typical scenarios are accounting systems that need to make past years' books
read-only or archive detail records for past years en masse. Without
partitioning you're stuck doing massive deletes to archive the old data for
an entire year. The bigger your database (and I imagine if you're paying
$100k+ for a storage subsystem you have a pretty massive database) the more
such data you're going to be trying to delete at once.

A similar situation also arises when you have to reload this data -- either
from a backup or for a DSS system backing your primary database. If you want
to access the data in a single normalized table you're stuck doing massive
inserts to move the data around.

With partitioned tables you can pull an entire partition out of the table
with a single DDL command. It's instantaneous; it doesn't even have to look
at the data in the partition. Similarly you can load data into your table
quickly (once it's been loaded into the database, that is).

I think it's obvious that these features are particularly useful for large
databases. Especially ones that have the ability to grow very quickly or load
lots of data from elsewhere very quickly. Ie, precisely the ones that have
humongous fast storage subsystems.

Yes, you can implement this using views of unions in Postgres. Or you can
implement it using inherited tables. But implementing it by hand has
disadvantages. Oracle already went through this. They had DBAs using union
views as early as version 7. They even had an optimizer hack that you could
enable to take advantage of it. But it was a tremendous kludge. The real
support they added in 8.1 is *much* better.

> Was this some kind of massive batch processing system?

Actually it was a web site, so an OLTP database. About 1-3M new records per
day in the most important table. Before we implemented partitioning we had a
regular job that spent 10+ hours deleting old records. If it failed in any
way things became very unhappy quickly. After partitioning we could create a
new partition and move the old partition out of the table at peak hours. Then
we could back up and drop the table containing the old partition's data at
our leisure.

> While I'm sure administration overhead is the major problem, I'm wondering
> what other ones you've seen, i.e. performance, reliability, user error,
> that are more common with home-rolled partitioning.

Well, we never even tried to implement it as a home-brew solution. I
certainly would have been pretty afraid of trying anything so complex myself
with live data.

> But what I'm really saying is that between good home-grown partitioning and
> fast hardware, the need for the pg devel team to implement partitioning is
> pretty low.

Ah. I thought you were saying that the fast hardware made partitioning in any
form unnecessary. Not merely that it made home-brew partitioning an
acceptable solution.
But that's a bit of a silly proviso though, isn't it? I mean, you could do
anything with enough plpgsql code and fast enough hardware. The real question
is where the best place is for this to be implemented.

Issuing a single atomic command sure makes me feel much better about
something than trying to set up a half dozen triggers/rules on a view and
hoping I get it all set up right. Especially when you think that I'll
probably have to do this for several tables at the same time.

Actually, I have a strong feeling that what really _ought_ to happen here is
that the inherited tables support in Postgres, which never really worked
anyway, should be deprecated and eventually removed. All that infrastructure
should be repurposed into partitioned tables. That seems like it would be a
nice fit.

--
greg
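To make the comparison concrete, the "single DDL command" Greg refers to
looks roughly like this on the Oracle side (a sketch from memory; partition
and table names are invented):

-- instantly detach one year's data into a standalone table
alter table orders exchange partition orders_2003 with table orders_2003_archive;

-- or simply discard it, without scanning the rows
alter table orders drop partition orders_2003;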
Scott Marlowe <smarlowe@g2switchworks.com> writes:

> Yes, it is a way. It's just a less necessary one than it once was, with
> hardware now able to provide the same performance increase with little or
> no work on the user's part. We've got to weigh the increased complexity it
> would take to implement it in PostgreSQL and maintain it versus the gain,
> and I say the gain is smaller every day.

Now I think you're contradicting your argument in the other subthread. It's
certainly *much* more complex to have to implement this yourself for each
table than to have it as a native Postgres feature. So I think you're saying
using partitioning in any form, whether native or home-brew, is better
because of the simplicity.

But if that's the argument then you're wrong about the high-end controllers
making this less urgent. High-end hardware controllers only make it easier to
gather the kind of data that requires some form of partitioning in one form
or another to make it manageable.

In any case partitioning offers algorithmic improvements in performance. No
matter how fast your controller is, it's not going to be able to delete 100GB
of data and match the speed of simply dropping a partition using DDL.

Partitioning is something DBAs are doing more and more often as the data sets
grow. And it's something Postgres DBAs are doing more and more often as
Postgres moves into problem domains that were previously the domain of Oracle
and DB2 DBAs. The only choice is whether they're doing it by kludging
together a failure-prone and suboptimal system or whether it's built into the
database in a reliable, convenient, and well designed form.

--
greg
Greg Stark wrote:
>
> Actually, I have a strong feeling that what really _ought_ to happen here
> is that the inherited tables support in Postgres, which never really worked
> anyway, should be deprecated and eventually removed. All that
> infrastructure should be repurposed into partitioned tables. That seems
> like it would be a nice fit.

I don't know about deprecating inheritance, but I agree with pretty much
everything Greg has said on this thread. In particular, I have felt for
several years now that the inheritance infrastructure could be used to
implement table partitioning.

We're using inheritance for DIY table partitioning on very expensive storage
hardware (~$500K), and we'd be dead in the water without it.

Joe
On Wed, 2005-03-09 at 17:29, Greg Stark wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
>
> > But what I'm really saying is that between good home-grown partitioning
> > and fast hardware, the need for the pg devel team to implement
> > partitioning is pretty low.
>
> Ah. I thought you were saying that the fast hardware made partitioning in
> any form unnecessary. Not merely that it made home-brew partitioning an
> acceptable solution.
>
> But that's a bit of a silly proviso though, isn't it? I mean, you could do
> anything with enough plpgsql code and fast enough hardware. The real
> question is where the best place is for this to be implemented.

Well, this is PostgreSQL, an extensible database, so the answer, to me, is to
implement it with a simple set of UDFs in userland as a proof of concept,
much like the materialized views recently discussed here. After that, if
someone thinks the basic concept is sound, it should get migrated into the
backend.

> Issuing a single atomic command sure makes me feel much better about
> something than trying to set up a half dozen triggers/rules on a view and
> hoping I get it all set up right. Especially when you think that I'll
> probably have to do this for several tables at the same time.

Sure, I'd love that too. But I think it's a bit too far ahead of the game
right now.

> Actually, I have a strong feeling that what really _ought_ to happen here
> is that the inherited tables support in Postgres, which never really worked
> anyway, should be deprecated and eventually removed. All that
> infrastructure should be repurposed into partitioned tables. That seems
> like it would be a nice fit.

I would imagine that both might be saved by the same task, i.e. if indexes
and triggers could span multiple tables etc., then partitioned tables would
be a pretty simple view creation.
> Actually, I have a strong feeling that what really _ought_ to happen here
> is that the inherited tables support in Postgres, which never really worked
> anyway, should be deprecated and eventually removed.

Hopefully not. They are useful for other things, too.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Back to the original question on this thread, and using PG 8.0.1. Perhaps
someone would like to poke holes in this (other than the need to set up fkeys
and indexes on the inherited tables...):

-------------------------------------------------------------------------------
begin;
create schema partition_test;
set search_path to partition_test,public;

create table test_base (
    id serial,
    value text,
    partitioner timestamp with time zone default now());

create table test_2005_03 () inherits (test_base);
create table test_2005_04 () inherits (test_base);

create rule base_partitioner_test_2005_03 as on insert
    to test_base where partitioner AT TIME ZONE 'UTC' between
        '2005-03-01'::timestamp and '2005-04-01'::timestamp
    do instead insert into test_2005_03 values (NEW.*);

create rule base_partitioner_test_2005_04 as on insert
    to test_base where partitioner AT TIME ZONE 'UTC' between
        '2005-04-01'::timestamp and '2005-05-01'::timestamp
    do instead insert into test_2005_04 values (NEW.*);

insert into test_base (value) values ('first string');
insert into test_base (value, partitioner) values ('a string', '2004-01-30 10:17:08');
insert into test_base (value, partitioner) values ('a string', '2005-04-01 14:17:08');

explain analyze select * from test_base;
select tableoid::regclass,* from test_base;

rollback;
-------------------------------------------------------------------------------

This could easily be wrapped up in a
'create_partition_by_date(base_table, partition_start, partition_length)'
function, I just haven't gotten around to that part yet. The function could
even look up the indexes and fkeys on the base table using the
INFORMATION_SCHEMA views.

One thing to note about PG not having indexes across tables: if we do get
in-memory bitmap indexes, I believe that the indexes on each inherited table
would actually be combined (if the planner sees it as a win).

Comments?

On Thu, 10 Mar 2005 12:59:35 +0100, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> > Actually, I have a strong feeling that what really _ought_ to happen here
> > is that the inherited tables support in Postgres, which never really
> > worked anyway, should be deprecated and eventually removed.
> Hopefully not. They are useful for other things, too.
>
> Karsten

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
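A rough sketch of the 'create_partition_by_date' helper Mike describes might
look like this (untested; the function name, the hard-coded 'partitioner'
column, and the YYYY_MM naming are assumptions based on the example above):

create or replace function create_partition_by_date(
    base_table  text,
    part_start  timestamp with time zone,
    part_length interval
) returns text as $$
declare
    part_end timestamp with time zone := part_start + part_length;
    child    text := base_table || '_' || to_char(part_start, 'YYYY_MM');
begin
    -- create the child table; indexes/fkeys could be copied here as well,
    -- e.g. by consulting the INFORMATION_SCHEMA views
    execute 'create table ' || quote_ident(child)
         || ' () inherits (' || quote_ident(base_table) || ')';

    -- route inserts that fall inside this partition's range to the child
    execute 'create rule ' || quote_ident(child || '_ins')
         || ' as on insert to ' || quote_ident(base_table)
         || ' where NEW.partitioner >= ' || quote_literal(part_start::text)
         || ' and NEW.partitioner < '    || quote_literal(part_end::text)
         || ' do instead insert into ' || quote_ident(child)
         || ' values (NEW.*)';

    return child;
end;
$$ language plpgsql;

-- usage sketch:
-- select create_partition_by_date('test_base', '2005-05-01', '1 month');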
Hi,

On Sat, 12 Mar 2005, Mike Rylander wrote:

> Back to the original question on this thread, and using PG 8.0.1. Perhaps
> someone would like to poke holes in this (other than the need to set up
> fkeys and indexes on the inherited tables...):
>
> [example SQL snipped]

Any ideas how we would get

    select * from test_base where partitioner between a and b

to just look in the correct partition and not do a full sweep over the other
partitions or the base?

Greetings
Christian

--
Christian Kratzer                       ck@cksoft.de
CK Software GmbH                        http://www.cksoft.de/
Phone: +49 7452 889 135                 Fax: +49 7452 889 136
On Sat, 12 Mar 2005 17:39:38 +0100 (CET), Christian Kratzer <ck-lists@cksoft.de> wrote:
> Any ideas how we would get
>
>     select * from test_base where partitioner between a and b
>
> to just look in the correct partition and not do a full sweep over the
> other partitions or the base?

There is no way to have PG completely skip the unused partitions. However,
with an index on the "partitioner" column of each partition table, PG will
basically fall off the end of the index almost immediately, and then use an
appropriate plan for the useful partitions. It's not the same as Oracle
partitioning, but without explicit planner work it can't be.

It should be about the same speed as one gigantic table, perhaps faster for
some queries where there are many total partitions and few useful ones, but
for my money that's not where the biggest benefit of partitioning lies.

My favorite feature of classic partitioning is that you can load and unload
large chunks of the data without affecting everyone else; you can very easily
archive unused partitions or transfer them to a statistics server for data
mining with no impact on other writers.

Another benefit would be the ability to put the currently write-active
partition on a very fast/expensive storage area and move the older data to a
slower/cheaper storage area using tablespaces.

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
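Continuing the earlier example, Mike's two suggestions might look roughly
like this (the tablespace names are invented, and the exact ALTER TABLE
syntax should be checked against the 8.0 documentation):

-- an index on the partitioning column of each child table
create index test_2005_03_partitioner_idx on test_2005_03 (partitioner);
create index test_2005_04_partitioner_idx on test_2005_04 (partitioner);

-- keep the write-active partition on fast storage...
create table test_2005_05 () inherits (test_base) tablespace fast_ts;

-- ...and retire older partitions to cheaper storage
alter table test_2005_03 set tablespace slow_ts;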
On Sat, Mar 12, 2005 at 08:24:20PM +0000, Mike Rylander wrote:
> There is no way to have PG completely skip the unused partitions. However,
> with an index on the "partitioner" column of each partition table, PG will
> basically fall off the end of the index almost immediately, and then use an
> appropriate plan for the useful partitions. It's not the same as Oracle
> partitioning, but without explicit planner work it can't be.

Note, the planner does have most of the intelligence needed to determine when
it can skip a partition. It has code that can determine whether the
intersection of:

    Partition constraint:  lower bound < value < upper bound
    Query clause:          value [operator] constant

will always be false or can be true. It's just a matter of getting the clause
to the right place to make it skip...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
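In terms of the earlier example, the "partition constraint" Martijn mentions
could be expressed as a CHECK constraint on each child table; as of 8.0 the
planner does not use these to skip child tables, which is exactly the missing
piece being discussed (constraint names and literals below are illustrative):

alter table test_2005_03 add constraint test_2005_03_range
    check (partitioner >= '2005-03-01' and partitioner < '2005-04-01');

alter table test_2005_04 add constraint test_2005_04_range
    check (partitioner >= '2005-04-01' and partitioner < '2005-05-01');

-- with such constraints in place, a query like
--   select * from test_base where partitioner >= '2005-04-10';
-- can provably return nothing from test_2005_03, so that scan could be skipped.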
On Sun, 13 Mar 2005 00:18:39 +0100, Martijn van Oosterhout <kleptog@svana.org> wrote:
> Note, the planner does have most of the intelligence needed to determine
> when it can skip a partition. It has code that can determine whether the
> intersection of:
>
>     Partition constraint:  lower bound < value < upper bound
>     Query clause:          value [operator] constant
>
> will always be false or can be true. It's just a matter of getting the
> clause to the right place to make it skip...

I see. I should have qualified my statement with "*as of now* there is no
way". If I felt comfortable digging into the planner I'd offer to take that
on, but... Thanks for pointing out that detail!

--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
Martijn van Oosterhout <kleptog@svana.org> writes:
> Note, the planner does have most of the intelligence needed to determine
> when it can skip a partition.

The issue isn't really whether the planner *could* use a constraint to decide
that a table need not be scanned at all. As you say, we have practically all
the infrastructure needed for such inferences. To me the real reason why we
don't do that is the lack of a mechanism to force the plan to be reconsidered
if the constraint is dropped. Without the constraint, the plan will still
run, but it may deliver wrong answers. (The closest current equivalent is the
possibility of dropping an index that a plan depends on --- but at least we
will fail outright if the index isn't there anymore.)

In short, I won't accept any such patch until after we build a mechanism for
invalidating cached plans. Which we need anyway for other reasons. So the
path ahead is clear, in my mind anyway ... but this is not the next step to
take.

			regards, tom lane
On Sat, Mar 12, 2005 at 11:35:51PM -0500, Tom Lane wrote:
> The issue isn't really whether the planner *could* use a constraint to
> decide that a table need not be scanned at all. As you say, we have
> practically all the infrastructure needed for such inferences. To me the
> real reason why we don't do that is the lack of a mechanism to force the
> plan to be reconsidered if the constraint is dropped. Without the
> constraint, the plan will still run, but it may deliver wrong answers. (The
> closest current equivalent is the possibility of dropping an index that a
> plan depends on --- but at least we will fail outright if the index isn't
> there anymore.)

Wow, now that's a corner case I hadn't thought of. Actually, it seems to me a
similar thing may happen if you have a query on a table and you create a new
inherited table from it. Is the new table included in the old plan? There are
any number of DDL statements that can affect planning. All of them, I
guess...

> In short, I won't accept any such patch until after we build a mechanism
> for invalidating cached plans. Which we need anyway for other reasons. So
> the path ahead is clear, in my mind anyway ... but this is not the next
> step to take.

People still keep running into the invalid plan issue; consider temporary
tables. Would it be feasible to have a plan maintain a list of objects it
depends on -- every column, table, constraint, domain, function, type, etc.
referred to -- and regenerate the plan if any of them change? This list could
become huge for any moderately complicated query, and there doesn't seem to
be a lot of gain most of the time. But it would be needed for completeness.

Actually, it would probably be enough to just list tables and columns and
have changes in constraints, triggers and default values be treated as
changes on the columns they refer to. I'm sure this has been discussed to
death already, though, and is just waiting for a suitable implementation.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.