Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What we need first is an explicit representation of partitioning, and
> then to build routing code on top of that. I haven't looked at
> Itagaki-san's syntax patch at all, but I think it's at least starting
> in a sensible place.
I have the following development plan for partitioning.
I'll continue to use inherits-based partitioning... at least in 8.5.
8.5 Alpha 3: Syntax and catalog changes (on-disk structure).
    I think pg_dump is the biggest obstacle in this phase.
8.5 Alpha 4: Internal representation (in-memory structure), which will
    replace insert triggers first, and also replace CHECK constraints
    if possible (but non-INSERT optimizations will probably slip to 8.6).
The internal representation of RANGE partitions will be an array of
pairs of { upper-value, partition-relid } for each parent table.
The target partition of an insert is determined by binary search.
That will be faster than checking CHECK constraints sequentially,
especially with a large number of child tables. The array will be kept
in CacheMemoryContext or query context to reduce access to the system
catalog. RelationData or TupleDesc will have an additional field for it.
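
To illustrate the lookup, here is a minimal sketch in Python rather than
backend C. The class name, the relid values, and the integer key type are
hypothetical, and I assume each upper-value is an exclusive bound; none of
that is decided by the plan above.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


class RangePartitionMap:
    """Hypothetical in-memory map for one parent table."""

    def __init__(self, pairs: List[Tuple[int, int]]) -> None:
        # pairs are (upper_value, partition_relid); sort once by upper bound
        # so that every later lookup can use binary search.
        ordered = sorted(pairs)
        self.uppers = [upper for upper, _ in ordered]
        self.relids = [relid for _, relid in ordered]

    def route(self, key: int) -> Optional[int]:
        # Index of the first partition whose (exclusive) upper bound is > key.
        idx = bisect_right(self.uppers, key)
        if idx == len(self.uppers):
            return None  # key is above every bound: no partition accepts it
        return self.relids[idx]


# Made-up relids for three child tables covering (..100), [100..200), [200..300).
parts = RangePartitionMap([(100, 16401), (200, 16402), (300, 16403)])
```

With this map, parts.route(150) picks the second child in O(log n)
comparisons, whereas evaluating CHECK constraints one child at a time is
O(n) in the number of children.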
> * It only applies to COPY. You'd certainly want routing for INSERT as
> well. And it shouldn't be necessary to specify an option.
Sure. We need the routing in both INSERT and COPY. Even if Emmanuel-san's
patch is committed in Alpha 3, the code would be discarded in Alpha 4.
> * Building this type of infrastructure on top of independent, not
> guaranteed consistent table constraints is just throwing more work
> into a dead end.
I think the current approach is not necessarily wrong for CHECK-based
partitioning, but I'd like to have more specialized or generalized
functionality for the replacement of triggers.
If we take the specialized approach, triggers will be replaced with a
built-in feature, and only RANGE and LIST partitions can be used.
On the other hand, it might be interesting to take a more generalized
approach; for example, splitting BEFORE INSERT triggers into 3 phases:
  1. Can cancel the insert and modify the new tuple.
  2. Can cancel the insert, but cannot modify the tuple.
  3. Can neither cancel nor modify.
We call triggers in phase order. The insert triggers for partitioning
would be implemented in the 2nd phase, so we need not be afraid of
partition keys being modified.
(The 3rd phase would be used for replication triggers.)
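
A rough sketch of that phase ordering, again in Python and purely
illustrative: the function name, the (phase, trigger) pairs, and the
example triggers are all hypothetical. A trigger returns a row, or None
to request cancellation; the driver ignores whatever a phase is not
allowed to do.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, int]
Trigger = Callable[[Row], Optional[Row]]


def run_before_insert(row: Row,
                      triggers: List[Tuple[int, Trigger]]) -> Optional[Row]:
    current = dict(row)
    # sorted() is stable, so triggers within a phase keep their given order.
    for phase, trigger in sorted(triggers, key=lambda t: t[0]):
        result = trigger(dict(current))  # hand out a copy, never the original
        if phase == 1:
            if result is None:
                return None       # phase 1 may cancel the insert...
            current = result      # ...and may modify the new tuple
        elif phase == 2:
            if result is None:
                return None       # phase 2 may cancel; modifications are dropped
        # phase 3: the return value is ignored entirely
    return current


replicated: List[int] = []


def clamp_id(row: Row) -> Optional[Row]:      # phase 1: modifies the tuple
    row["id"] = max(row["id"], 0)
    return row


def route_check(row: Row) -> Optional[Row]:   # phase 2: stands in for routing
    return row if row["id"] < 300 else None


def replicate(row: Row) -> Optional[Row]:     # phase 3: only observes
    replicated.append(row["id"])
    return None


TRIGGERS = [(3, replicate), (2, route_check), (1, clamp_id)]
```

Because routing runs only after every phase-1 trigger has finished, the
partition key it sees is final; a row that routing rejects never reaches
the phase-3 replication trigger.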
However, I think the generalized one is overkill.
The specialized approach would be enough.
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center