Josh Berkus <josh@agliodbs.com> writes:
> Well, I would think that specifying an expression that defines a new partition
> at each change in value (like EXTRACT(day FROM timestamp) on a time-based
> partitioning) would cover 90% of implemenations and be a lot simpler to
> administer. The Oracle approach has the advantage of allowing "custom
> paritioning" at the expense of greater complexity.
Hm. This is where I might be less helpful. Once you're submersed in one way of
doing things it can be hard to think outside the box like this.
But I fear this scheme might be harder to actually take advantage of. If I do
a query like
WHERE timestamp BETWEEN '2005-01-01 11:00' AND '2005-01-01 12:00'
How do you determine which partitions that range will cover?
Also, it seems like it would be inconvenient to try to construct expressions
to handle things like "start a new partition ever 1 million values".
And worse, how would you handle changing schemes with this? Like, say we want
to switch from starting one partition per month to starting one partition per
week?
I think some actual use cases might be helpful for you. I can contribute an
interesting one, though I have to be intentionally vague even though I don't
work on that system any more.
We had a table with a layout like:
txnid serial,
groupid integer,
data...
Each day a cron job created 6 new groups (actually later that was changed to
some other number). It then added a new partition to handle the range of the
new day's groups. Later another cron job exchanged out the partition from a
week earlier and exported that table, transfered it to another machine and
loaded it there.
txnid was a unique identifier but we couldn't have a unique constraint because
that would have required a global index. That didn't cause any problems since
it was a sequence generated column anyways.
We did have a unique index on <groupid,txnid> which is a local index because
groupid was the partition key. In reality nothing in our system ever really
needed a txn without knowing which group it came from anyways, so it was easy
to change our queries to take advantage of this.
We had a lot of jobs, some *extremely* performance sensitive that depended on
being able to scan the entire list of txns for a given day or a given set of
groupids. The partitions meant it could do a full table scan which made these
extremely fast.
This was with Oracle 8i. All partition keys in 8i were ranges. In 9 Oracle
added the ability to make partition reference specific id values. Sort of like
how you're describing having a key expression. We might have considered using
that scheme with groupid but then it would have meant adding a bunch of new
partitions each day and having some queries that would involve scanning
multiple partitions.
--
Greg