Re: Index/Function organized table layout - Mailing list pgsql-hackers
| From | James Rogers | 
|---|---|
| Subject | Re: Index/Function organized table layout | 
| Date | |
| Msg-id | 1065127491.9267.146.camel@localhost.localdomain Whole thread Raw | 
| In response to | Re: Index/Function organized table layout (Hannu Krosing <hannu@tm.ee>) | 
| Responses | Re: Index/Function organized table layout | 
| List | pgsql-hackers | 
On Thu, 2003-10-02 at 12:09, Hannu Krosing wrote: > So what you really need is the CLUSTER command to leave pages half-empty > and the tuple placement logic on inserts/updates to place new tuples > near the place where they would be placed by CLUSTER. I.e. the code that > does actual inserting should be aware of CLUSTERing. Not exactly. What you are describing is more akin to partitioning or hash-organized tables i.e. sorting insert/update tuples to various pages according to some hash function. This works well if you a table is composed of many well-defined working sets but the access and query patterns for a given working set are varied. A common example of this is for large accounting tables, which are frequently partitioned by a function which truncates the record timestamp to the year/month. Note that indexes can be partitioned as well, so that as long as you are doing queries within a given month (using the accounting example), the query performance is generally what you would expect if the entire table was the size of one month's data, no matter how big the table actually gets. Partitions and hash-organized tables (for RDBMS that support them) are often handled internally as multiple tables masquerading as one (kind of like a view), with the hash function determining which table is actually being accessed. This allows the possibility of doing things like locking individual partitions or setting them "read-only", and generally having the option of treating them as individual tables for administrative purposes e.g. deleting a partition of old unused data without adversely affecting the "active" partition in the way it would if they were truly one table, even if they look like one table from a SQL-level view. What I really need in my specific case is B-Tree organized tables, though hashing/partitioning would help too. It *is* a similar kind of problem. B-Tree organized tables basically make the table and the primary index the same thing, and tend to be most useful for tables that use a single multi-column primary key index for queries. This has the effect of putting all the tuples for a typical query in the same small number of heap pages (and therefore buffers), allowing very efficient access in the typical case with the caveat that non-indexed queries will be quite slow. B-Tree organized tables are particularly useful when the insert order is orthogonal to the typical query order. As I mentioned before, tables that store parallel collections of time-series data are classic examples. Often the data is inserted into the pages in order that can roughly be described as (timestamp, id), but is queried using (id, timestamp) as the index. If you have enough ids, you end up with the pathological case where you typically have one relevant tuple per page for a given query. > True, but my above suggestion would be much easier to implement > near-term. It seems to be a nice incremental improvement just needs > to touch places: > > 1. selecting where new tuples go : > > * updated ones go near old ones if not clustered and near the place > CLUSTER would place them if clustered. > > * inserted ones go to the less than half-empty pages if not clustered > and near the place CLUSTER would place them if clustered. > > 2. making reorganization code (CLUSTER and VACUUM FULL) to leave space > in pages for clustered updates/inserts. The nuisance would be keeping track of which pages are collecting which tuples. Knowing the CLUSTER index doesn't tell you much about which pages would currently be a good place to put a new tuple. You could always markup the index that CLUSTER uses to keep track of good candidates (plus some additional structures), but the more I think about that, the more it looks like a nasty hack. Cheers, -James Rogersjamesr@best.com
pgsql-hackers by date: