Re: Sponsoring enterprise features - Mailing list pgsql-hackers
From | Greg Stark |
---|---|
Subject | Re: Sponsoring enterprise features |
Date | |
Msg-id | 871xs02gaf.fsf@stark.dyndns.tv Whole thread Raw |
In response to | Re: Sponsoring enterprise features (Josh Berkus <josh@agliodbs.com>) |
List | pgsql-hackers |
Josh Berkus <josh@agliodbs.com> writes: > I'm a little unclear, personally, about what can be accomplished through table > partitioning that we can't currently do through partial indexes and inherited > tables, especially after Gavin finishes his tablespaces patch (btw, Gavin > could use sponsorship on that one, I think). Can you make your case to > me/the list? So far, the only arguments we've gotten on this list have been > of the "Oracle does it that way" variety so it'd be interesting to see > something concrete. Well probably everyone who wants it is saying things of the form "they were a useful with Oracle because...". Which isn't the same thing as "Oracle does it this way". I don't particularly care how partitioned tables are *implemented*, only the net effect. You can think of them as an abstraction over inherited tables that let the database guarantee your data integrity and offer query optimizations in a way it cannot if you build it by hand. I know for us they were an absolute godsend. The main advantages over a single monolithic table even with partial indexes are: 1) Being able to load and unload parts of the table quickly. Adding and removing a partition is basically a DDL operation, not DML. It doesn't have to visit every tuple and mark itdeleted or added. It just has to add or remove the entire partition to the structure. Partitioned tables are frequently used for aging out old data. The common example is of having a partition per month andkeeping 3-12 months of data. We had a more extreme case where we had one partition per day and kept 21 days of data. When we implemented partitioned tables the time to archive and delete the old data went from taking most of thenight and killing production performance to effectively instantaneous and we were able to run it at peak time. 2) Being able to do a sequential scan of a partition. Sequential scans are faster than index scans. Sometimes much faster. Partial indexes are nice but when they cover 10-20%of your table scanning them is much slower than a sequential scan of a partition. As for inherited tables. Well, I would expect a partitioned tables scheme to be implemented using inherited tables or just using views. You could jury-rig it today using these tools, it would just be very awkward and fragile. The original Oracle implementation in Oracle 7 was implemented much the same way using views. They were a complete hack and required lots of manual tweaking though. The point of partitioned tables is a) The database ensures tuples go into the correct partition. If you used a manually constructed view or inherited tablesyou would always run the risk of inserting into the wrong partition which would break your data integrity. b) The database automatically optimizes queries to query the correct partitions. It detects clauses in the query much likepartial indexes so you don't have to tweak every query by hand and the database can skip clauses that match the partitionclause exactly. Also this is a prime opportunity for the database to introduce parallel queries because each partitioncan be accessed independently. -- greg
pgsql-hackers by date: