Re: Partitioning - Mailing list pgsql-performance
| From | Greg Stark |
|---|---|
| Subject | Re: Partitioning |
| Date | |
| Msg-id | 87y8jarhoz.fsf@stark.xeocode.com |
| In response to | Partitioning ("J. Andrew Rogers" <jrogers@neopolitan.com>) |
| Responses | Re: Partitioning |
| List | pgsql-performance |
"J. Andrew Rogers" <jrogers@neopolitan.com> writes: > We do something very similar, also using table inheritance I have a suspicion postgres's table inheritance will end up serving as a good base for a partitioned table feature. Is it currently possible to query which subtable a record came from though? > A big part of performance gain is that the resulting partitions end up being > more well-ordered than the non-partitioned version, since inserts are hashed > to different partition according to the key and hash function. It is kind of > like a cheap and dirty real-time CLUSTER operation. There is also one particular performance gain that cannot be obtained via other means: A query that accesses a large percentage of a single partition can use a sequential table scan of just that partition. This can be several times faster than using an index scan which is the only option if all the data is stored in a single large table. This isn't an uncommon occurrence. Consider an accounting table partitioned by accounting period. Any aggregate reports for a single accounting period fall into this category. If you define your partitions well that can often by most or all of your reports. Of course this applies equally if the query is accessing a small number of partitions. A further refinement is to leverage the partitioning in GROUP BY or ORDER BY clauses. If you're grouping by the partition key you can avoid a large sort without having to resort to an index scan or even a hash. And of course it's tempting to think about parallelization of such queries, especially if the partitions are stored in separate table spaces on different drives. > It also lets you truncate, lock, and generally be heavy-handed with subsets > of the table without affecting the rest of the table. The biggest benefit by far is this management ability of being able to swap in and out partitions in a single atomic transaction that doesn't require extensive i/o. 
In the application where we used them, on Oracle 8i, they were an absolute life-saver. They took a huge batch job that took several days to run in off-peak hours and turned it into a single quick cron job that could run at peak hours. We were able to cut the delay for our OLTP data showing up in the data warehouse from about a week, after extensive manual work, to hours, after a daily cron job.

> ...PARTITION ON 'date_trunc(''hour'',ts)'...
>
> There would also probably need to be some type of metadata table to
> associate specific hashes with partition table names. Other than that,
> the capabilities largely already exist, and managing the partition
> hashing and association is the ugly part when rolling your own.
> Intercepting DML when necessary and making it behave correctly is
> already pretty easy, but could probably be streamlined.

I would suggest you look at the Oracle syntax to handle this. They've already gone through several iterations of implementations.

The original Oracle 7 implementation was much as people have been describing: you had to define a big UNION ALL view and enable an option to have the optimizer look for such views and attempt to eliminate partitions. In Oracle 8i they introduced first-class partitions, with commands to define and manipulate them; you defined a high bound for each partition. In Oracle 9 (or thereabouts, sometime after 8i at any rate) they introduced a new form where you specify a specific constant value for each partition. This seems to be more akin to how you're thinking about things.

The optimizer has several plan nodes specific to partitioned tables. It can select a single known partition based on information present in the query. It can also detect cases where it can be sure the query will only access a single partition but won't be able to determine which until execution time, based on placeholder parameters, for cases like "WHERE partition_key = ?". It can also detect cases like "WHERE partition_key between ? and ?"
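For reference, the Oracle 7-style roll-your-own approach translates to Postgres roughly as follows. All names here are invented for illustration, and it assumes per-period subtables like ledger_2005q1/ledger_2005q2 with columns (period, account, amount):

```sql
-- Illustrative Oracle-7-style partitioning: a UNION ALL view over subtables
CREATE VIEW ledger_all AS
    SELECT * FROM ledger_2005q1
    UNION ALL
    SELECT * FROM ledger_2005q2;

-- Intercepting DML: views need one unconditional INSTEAD rule, plus one
-- conditional rule per partition to redirect inserts to the right subtable
CREATE RULE ledger_all_ins AS ON INSERT TO ledger_all
    DO INSTEAD NOTHING;

CREATE RULE ledger_all_ins_q1 AS ON INSERT TO ledger_all
    WHERE NEW.period >= '2005-01-01' AND NEW.period < '2005-04-01'
    DO INSTEAD
    INSERT INTO ledger_2005q1 (period, account, amount)
    VALUES (NEW.period, NEW.account, NEW.amount);
```

This is exactly the "ugly part when rolling your own" the quote above describes: the rules and the hash/range-to-table mapping all have to be maintained by hand.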
and "WHERE partition_key IN (?,?,?)" Or join clauses on partitions. It can also do some magic things with "GROUP BY partition_key" and "ORDER BY partition_key". The work in the optimizer will be the most challenging part. In an ideal world if the optimizer is very solid it will be possible to bring some partitions to slow or even near-line storage media. As long as no live queries accidentally access the wrong partitions the rest of the database need never know that the data isn't readily available. -- greg