Table Partitioning, Part 1 - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Table Partitioning, Part 1 |
Date | |
Msg-id | 1115672573.3830.73.camel@localhost.localdomain Whole thread Raw |
List | pgsql-hackers |
Many people have been discussing Table Partitioning lately. I've also been giving thought to how to implement Table Partitioning within PostgreSQL, as part of the Bizgres project for Business Intelligence. After some discussion on Bizgres, I've now posted the most important and common Use Cases for Business Intelligence. Input from Hannu Krosing, Josh Berkus, Luke Lonergan, Bricklen Anderson, as well as noting comments from Stacey White, Yann Michel on other lists. This can be found at: http://www.bizgres.org/pages.php?pg=developers%7Cprojects%7Cbizgres% 7Cpartitioningrequirements It's now time to look in detail at how to implement some of the internals of Partitioning within PostgreSQL. I'd like to kick off with some of the trickiest design decisions we need to make, and comment on what the implications are for each choice. Rather than go for a hugely long post, I'll start with a summary of the main questions...then continue some more in a second post soon afterwards, going into more detail. (Thats why the subject of this says "Part 1"). 1. Embellish inheritance, or separate infrastructure? 2. Individual Relations explicitly in the plan or MultiRelation plan nodes? (i.e. is a SeqScan of a Partitioned Table one Node or many nodes?) 3. Partitions are relations? Or sub-relations? 4. Should all partitions have the same indexes? 5. Constraints or specific Partitioning syntax? Those are the main questions, as I see them currently. I'll go through them in order, though (2) turns out to be the biggest question. Some of those questions have been raised by Neil Conway, Hannu Krosing, Jim Nasby, Josh Berkus and Gavin Sherry. I answered many of those questions at the time, though now I want to discuss these again since they are such important questions. Some of the reason that this is fairly complex is that the Partitioning Use Cases (PPUC) published show that we need to take advantage of both: PPUC1 Direct Restriction of the Partitioning Key PPUC2 Joins on the Partitioning Key More detail can be found at the link (same as above)... http://www.bizgres.org/pages.php?pg=developers%7Cprojects%7Cbizgres% 7Cpartitioningrequirements Many of us will have experience with similar applications, though frequently with just one or other of those Use Cases. At first, I mistakenly thought the Use Cases were identical, though their technical solutions do seem to be fairly different for the db internals. In terms of other implementations, it appears that Oracle was designed for Direct Restriction of the Partitioning Key (PPUC1), whereas the latest SQLServer implementation appears more suitable for cases involving Joins on the Partitioning Key (PPUC2). Thats not too important, just that if you know either of those database systems you may be tempted into thinking that you know how partitioning should be done by reference to one of those implementations (as I did initially). I'm also striving to find a subset of this that can be completed for PostgreSQL 8.1, with the majority being completed for 8.2. (Bizgres releases *may* vary from that schedule) Best Regards, Simon Riggs
pgsql-hackers by date: