Thread: Index/Function organized table layout
Hi,

I am running a large and mature mission-critical PostgreSQL implementation with multiple tables that are growing at a rate of several million rows per month (and the rate of addition is growing as well). It is a 24x7 operation, so downtime is a party foul and apparent performance has to be reasonably good all the time.

The problem: My working set is typically several million rows (and growing) at any one time, which has a tendency to thrash the buffers mercilessly. Records are inserted in an order that does not reflect typical retrieval, such that a typical query has to hit many, many blocks to collect all the rows in a given range query. CLUSTER isn't cutting it (more below). Definitely sub-optimal.

Now, I've actually hacked commercial MVCC engines back in the day, and am comfortable playing around in database internals. I have an "itch to scratch" for improving the scalability of Really Large Tables by explicitly allowing control of table layouts as an optional property of the tables. I would like some feedback as to whether this is practical given the current architecture; it appears that it is, but I'm not intimately familiar with it (yet).

1.) B-tree organized tables. The primary key index also contains the entire row. Upside: very fast retrieval for indexed queries over large data sets and very efficient buffer usage. Downside: in practice, the primary key index has to be the only index on the table, and the locking behaves more like an index page than a data page. I've actually used this feature a lot for very large tables in Oracle implementations, and you can get substantial speed improvements versus normal indexed tables if used correctly on large tables. It is vastly more convenient for a 24x7 operation than using the CLUSTER command, which creates nasty contention, doesn't automatically order the table on insert/update (i.e. it has to be run regularly), and basically ends up with an index that is redundant.

2.) Function/Hash organized tables (aka partitioning). Partitioning tuples across blocks according to a user-provided function is the second target on my short list. Doing this well (i.e. being able to take individual partitions offline or set them read-only) takes a hell of a lot of work, but having a table that internally clusters its records according to a user-provided hash function is a good start. This isn't a beginner feature (stupidly designed transactions can be quite costly on partitioned tables), but it is a very nice tool to have. True partitioning would be enterprise-grade gold, but that target may be a bit high for now.

Both of these things really are attempts to address the same basic problem, which is optimizing the number of buffers a given query uses by making the table's layout reflect typical queries. Given the size of our typical working sets, optimizing layout greatly improves performance by reducing buffer thrashing.

I would very much like to work on improving this type of feature (automatic table layout optimization), but I thought I would check with the people currently hacking on the system first, to see if there is a showstopper or if someone is already working on this.

Cheers,

-James Rogers
 jamesr@best.com
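For concreteness, here is a minimal sketch of what (1) looks like in Oracle, where it is spelled ORGANIZATION INDEX; the table and column names are invented purely for illustration:

  -- Oracle dialect; all names are hypothetical. The rows are stored in
  -- the primary key B-tree itself, so a range scan over
  -- (sensor_id, recorded_at) touches a few adjacent blocks instead of
  -- potentially one heap block per row.
  CREATE TABLE sensor_readings (
      sensor_id    NUMBER  NOT NULL,
      recorded_at  DATE    NOT NULL,
      reading      NUMBER,
      CONSTRAINT sensor_readings_pk PRIMARY KEY (sensor_id, recorded_at)
  )
  ORGANIZATION INDEX;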
On Tue, Sep 30, 2003 at 11:31:26PM -0700, James Rogers wrote:

> The problem: My working set is typically several million rows (and growing)
> at any one time, which has a tendency to thrash the buffers mercilessly.
> Records are inserted in an order that does not reflect typical retrieval,
> such that a typical query has to hit many, many blocks to collect all the
> rows in a given range query. CLUSTER isn't cutting it (more below).
> Definitely sub-optimal.

I have a situation that is very similar to yours, and my problems are very similar too. I hope some of your ideas are implementable in some way, because I think they would solve some of my problems as well.

> 1.) B-tree organized tables. The primary key index also contains the
> entire row.

I think this is called a clustered index on some other database systems. Basically you want to replace the content of the btree item with the whole tuple, instead of the pointer to the heap element which contains the tuple.

One thing to keep in mind is that index tuples are limited to BLCKSZ/3 IIRC, so this limits the size of tuples that can be put in such a "table". TOASTing may help alleviate this problem.

As for other indexes, I'm not sure why you say this precludes the use of other indexes. The only thing they would have to do is keep pointers to index elements instead of heap elements. Doesn't sound impossible to me.

Another thing to keep in mind: L&Y requires unique keys. In the current btree implementation this is worked around by using the pointers-to-heap (ctid?) to differentiate items that have the same key. If you use a clustered index you won't have this pointer; maybe it will be required that this index is marked UNIQUE. This may not be too onerous a restriction, given that the index is a primary key after all.

I don't have anything to say about 2).

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"It never stops being humbling for a person of wit to learn that there is no fool who cannot teach him something." (Jean B. Say)
James Rogers <jamesr@best.com> writes:
> Now, I've actually hacked commercial MVCC engines back in the day, and am
> comfortable playing around in database internals. I have an "itch to
> scratch" for improving the scalability of Really Large Tables by explicitly
> allowing control of table layouts as an optional property of the tables. I
> would like some feedback as to whether this is practical given the current
> architecture; it appears that it is, but I'm not intimately familiar with it

I think you'd need to do some basic architectural work first. Right now we have a clean API for index access methods, but there is no comparable abstraction layer for heaps (tables). It'd probably be necessary to create such a layer in order to allow different heap organizations to be supported. Another point of confusion is that heaps and indexes are rigidly distinct. Perhaps some heaps could be considered to be indexes as well, in order to support your idea of b-tree-organized tables. Doing that would undoubtedly break a few places though.

> Both of these things really are attempts to address the same basic problem,
> which is optimizing the number of buffers a given query uses by making the
> table's layout reflect typical queries.

Hm, are you sure that smarter buffer management wouldn't serve the purpose?

			regards, tom lane
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> As for other indexes, I'm not sure why you say this precludes the use of
> other indexes. The only thing they would have to do is keep pointers to
> index elements instead of heap elements. Doesn't sound impossible to me.

However, btree feels free to move index entries around while inserting other entries. I'm not sure that you could usefully use "ctid" as an identifier for tuples in a btree-organized heap. This will break more things than just other indexes :-( ... UPDATE chaining, for one example.

> Another thing to keep in mind: L&Y requires unique keys. In the
> current btree implementation this is worked around by using the
> pointers-to-heap (ctid?) to differentiate items that have the same key.
> If you use a clustered index you won't have this pointer; maybe it will
> be required that this index is marked UNIQUE.

IIRC this assumption is really only used when re-finding the parent downlink after a page split, and so we were able to get rid of it by looking for the downlink by child block number, without using the key at all. So that part doesn't bother me. The mutability of index ctids seems like a much worse problem.

			regards, tom lane
On Wed, Oct 01, 2003 at 11:37:38AM -0400, Tom Lane wrote:
> James Rogers <jamesr@best.com> writes:
> > Both of these things really are attempts to address the same basic problem,
> > which is optimizing the number of buffers a given query uses by making the
> > table's layout reflect typical queries.
>
> Hm, are you sure that smarter buffer management wouldn't serve the
> purpose?

It doesn't help when there is a lot of access locality in searching. In my case I want to select some thousands of records that were inserted far apart from each other, but are logically very near. Having this pseudo-heap that is ordered by definition helps very much with the selection; the current heap requires me to bring lots of uninteresting tuples into the buffers, whichever buffer management algorithm is used, because they are in the same pages as the interesting tuples.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Linux transformed my computer from a 'machine for doing things' into a genuinely entertaining device, from which I learn something new every day." (Jaime Salinas)
On Wed, 2003-10-01 at 09:29, Alvaro Herrera wrote:
> On Wed, Oct 01, 2003 at 11:37:38AM -0400, Tom Lane wrote:
> > Hm, are you sure that smarter buffer management wouldn't serve the
> > purpose?
>
> It doesn't help when there is a lot of access locality in searching. In my
> case I want to select some thousands of records that were inserted far
> apart from each other, but are logically very near. Having this
> pseudo-heap that is ordered by definition helps very much with the
> selection; the current heap requires me to bring lots of uninteresting
> tuples into the buffers, whichever buffer management algorithm is used,
> because they are in the same pages as the interesting tuples.

Yes, what Alvaro said. For very large tables that routinely run modest range queries, it can be very expensive in terms of cache efficiency if tuples that are closely grouped and ordered logically are scattered throughout the heap. The requirement to buffer a lot of unrelated data for typical-case queries can greatly reduce the cache hit rate if the active portion of the data is already quite large relative to the physical RAM available.

To give a real-world example, a standard query on one of our tables that has not been CLUSTER-ed recently (i.e. within the last several days) generates an average of ~2,000 cache misses. Recently CLUSTER-ed, it generates ~0 cache misses on average. Needless to say, one is *much* faster than the other. The problem is that the number of buffers required to satisfy this query with the tuples scattered is enough to make it swap out the buffers of a competing query on another table that is running at the same time. The result is that performance grinds to a halt as processes compete and try to swap out each other's buffers, resulting in a lot less *actual* buffering than should be occurring given the amount of data actually being queried.

In my case, not only does CLUSTER-ing increase the number of concurrent queries possible without disk thrashing by an integer factor, but the number of buffers touched by a query that generates cache misses is greatly reduced as well. The problem is that CLUSTER-ing is costly, and index-organizing some of the tables would reduce the buffer needs, since the index tuples in these cases are almost as large as the heap tuples they reference.

The classic scenario for this is when you have a large collection of time-series data stored in a table, with each series keyed to another table. The typical tuple distribution creates pathological behaviors when buffer space becomes tight.

Cheers,

-James Rogers
 jamesr@best.com
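To make the time-series scenario concrete, here is a minimal sketch of the kind of schema and maintenance being described; all names are invented, and the CLUSTER syntax shown is the 7.x-era form:

  -- Hypothetical time-series table: rows arrive in timestamp order,
  -- but queries read one series at a time in (series_id, ts) order.
  CREATE TABLE samples (
      series_id  integer    NOT NULL,
      ts         timestamp  NOT NULL,
      value      numeric
  );
  CREATE INDEX samples_series_ts ON samples (series_id, ts);

  -- Typical range query: the matching tuples are adjacent in index
  -- order but scattered across the heap in insert order, so each one
  -- can cost a separate buffer.
  SELECT ts, value
    FROM samples
   WHERE series_id = 42
     AND ts >= '2003-09-01' AND ts < '2003-10-01';

  -- The periodic maintenance described above: rewrite the heap in
  -- index order, at the cost of locking out access for the duration.
  CLUSTER samples_series_ts ON samples;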
On Wed, 2003-10-01 at 08:37, Tom Lane wrote:
> I think you'd need to do some basic architectural work first. Right now
> we have a clean API for index access methods, but there is no comparable
> abstraction layer for heaps (tables). It'd probably be necessary to
> create such a layer in order to allow different heap organizations to be
> supported. Another point of confusion is that heaps and indexes are
> rigidly distinct. Perhaps some heaps could be considered to be indexes
> as well, in order to support your idea of b-tree-organized tables.
> Doing that would undoubtedly break a few places though.

Ahhh, okay. So it would require an architectural mod. Still, it would probably be a worthwhile project (for me at least). There are a number of enterprise features that rely on a similar architectural premise that would also be good to have eventually. We are at the point at this company where we need some of these features in our databases for scalability and management purposes, but we aren't all that eager to migrate to Oracle (the alternative). From my standpoint, it would be substantially cheaper to actively work on adding the needed features to Postgres than to move to Oracle to get the same features. I clearly need to take a good look at the relevant source.

> > Both of these things really are attempts to address the same basic problem,
> > which is optimizing the number of buffers a given query uses by making the
> > table's layout reflect typical queries.
>
> Hm, are you sure that smarter buffer management wouldn't serve the
> purpose?

No, the problem isn't really buffer management; the buffer usage behaviors are largely valid for the situation. The problem is that the number of relevant tuples per buffer is low for a given query, so we aren't getting much tuple bang for the buffer buck. I suppose one could look at it as trying to improve the intra-buffer hit rate for a given query.

Cheers,

-James Rogers
 jamesr@best.com
James Rogers wrote on Thu, 02.10.2003 at 20:50:
> To give a real-world example, a standard query on one of our tables that
> has not been CLUSTER-ed recently (i.e. within the last several days)
> generates an average of ~2,000 cache misses. Recently CLUSTER-ed, it
> generates ~0 cache misses on average. Needless to say, one is *much*
> faster than the other.

So what you really need is for the CLUSTER command to leave pages half-empty, and for the tuple placement logic on inserts/updates to place new tuples near the place where they would be put by CLUSTER. I.e. the code that does the actual inserting should be aware of CLUSTERing.

I guess that similar behaviour (half-empty pages, or even every second page empty, which is better as it creates fewer dirty buffers) could also significantly speed up updates on huge numbers of tuples, as the code could then always select a place near the old tuple and thus avoid needless head movements between reading and writing areas.

> In my case, not only does CLUSTER-ing increase the number of concurrent
> queries possible without disk thrashing by an integer factor, but the
> number of buffers touched by a query that generates cache misses is
> greatly reduced as well. The problem is that CLUSTER-ing is costly, and
> index-organizing some of the tables would reduce the buffer needs, since
> the index tuples in these cases are almost as large as the heap tuples
> they reference.

True, but my suggestion above would be much easier to implement near-term. It seems to be a nice incremental improvement that just needs to touch two places:

1. Selecting where new tuples go:
   * updated ones go near the old ones if the table is not clustered, and near the place CLUSTER would put them if it is;
   * inserted ones go to less-than-half-empty pages if the table is not clustered, and near the place CLUSTER would put them if it is.

2. Making the reorganization code (CLUSTER and VACUUM FULL) leave space in pages for clustered updates/inserts.

The "half" above could of course mean anything from 10% to 95%, depending on access patterns.

---------------------
Hannu
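A sketch of how the reorganization half of this might surface at the SQL level, reusing the samples table from the earlier sketch. The fillfactor-style storage parameter is hypothetical in the context of this thread (later PostgreSQL releases did grow a similar knob), and the insert-placement half of the proposal would still need entirely new code:

  -- Hypothetical: ask reorganization to pack pages only half full.
  ALTER TABLE samples SET (fillfactor = 50);

  -- CLUSTER would then leave each page half-empty, so future inserts
  -- and updates have room to land near their index-order neighbours
  -- instead of at the end of the heap.
  CLUSTER samples_series_ts ON samples;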
On Thu, 2003-10-02 at 12:09, Hannu Krosing wrote:
> So what you really need is for the CLUSTER command to leave pages half-empty,
> and for the tuple placement logic on inserts/updates to place new tuples
> near the place where they would be put by CLUSTER. I.e. the code that
> does the actual inserting should be aware of CLUSTERing.

Not exactly. What you are describing is more akin to partitioning or hash-organized tables, i.e. sorting insert/update tuples into various pages according to some hash function. This works well if a table is composed of many well-defined working sets but the access and query patterns for a given working set are varied. A common example of this is large accounting tables, which are frequently partitioned by a function that truncates the record timestamp to the year/month. Note that indexes can be partitioned as well, so that as long as you are doing queries within a given month (using the accounting example), the query performance is generally what you would expect if the entire table were the size of one month's data, no matter how big the table actually gets.

Partitions and hash-organized tables (for RDBMSs that support them) are often handled internally as multiple tables masquerading as one (kind of like a view), with the hash function determining which table is actually being accessed (see the hand-rolled sketch after this message). This allows the possibility of doing things like locking individual partitions or setting them "read-only", and generally having the option of treating them as individual tables for administrative purposes, e.g. deleting a partition of old unused data without adversely affecting the "active" partition the way it would if they were truly one table, even though they look like one table from the SQL-level view.

What I really need in my specific case is B-tree organized tables, though hashing/partitioning would help too. It *is* a similar kind of problem. B-tree organized tables basically make the table and the primary index the same thing, and tend to be most useful for tables that use a single multi-column primary key index for queries. This has the effect of putting all the tuples for a typical query in the same small number of heap pages (and therefore buffers), allowing very efficient access in the typical case, with the caveat that non-indexed queries will be quite slow.

B-tree organized tables are particularly useful when the insert order is orthogonal to the typical query order. As I mentioned before, tables that store parallel collections of time-series data are classic examples. Often the data is inserted into the pages in an order that can roughly be described as (timestamp, id), but is queried using (id, timestamp) as the index. If you have enough ids, you end up with the pathological case where you typically have one relevant tuple per page for a given query.

> True, but my suggestion above would be much easier to implement
> near-term. It seems to be a nice incremental improvement that just needs
> to touch two places:
>
> 1. Selecting where new tuples go:
>    * updated ones go near the old ones if the table is not clustered, and
>      near the place CLUSTER would put them if it is;
>    * inserted ones go to less-than-half-empty pages if the table is not
>      clustered, and near the place CLUSTER would put them if it is.
>
> 2. Making the reorganization code (CLUSTER and VACUUM FULL) leave space
>    in pages for clustered updates/inserts.

The nuisance would be keeping track of which pages are collecting which tuples.
Knowing the CLUSTER index doesn't tell you much about which pages would currently be a good place to put a new tuple. You could always mark up the index that CLUSTER uses to keep track of good candidates (plus some additional structures), but the more I think about that, the more it looks like a nasty hack.

Cheers,

-James Rogers
 jamesr@best.com
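The hand-rolled sketch of the accounting example referred to above; all table and column names are invented. Each month lives in its own physical table behind a single view, with date_trunc() playing the role of the partitioning function:

  -- One physical table per month, each constrained to its own range.
  CREATE TABLE ledger_2003_09 (
      entry_ts  timestamp NOT NULL
                CHECK (date_trunc('month', entry_ts) = TIMESTAMP '2003-09-01'),
      account   integer   NOT NULL,
      amount    numeric   NOT NULL
  );
  CREATE TABLE ledger_2003_10 (
      entry_ts  timestamp NOT NULL
                CHECK (date_trunc('month', entry_ts) = TIMESTAMP '2003-10-01'),
      account   integer   NOT NULL,
      amount    numeric   NOT NULL
  );

  -- The "one table" the application sees.
  CREATE VIEW ledger AS
      SELECT * FROM ledger_2003_09
      UNION ALL
      SELECT * FROM ledger_2003_10;

  -- Retiring a month of old data is then a cheap DROP TABLE rather
  -- than a bulk DELETE that disturbs the active partition.
  DROP TABLE ledger_2003_09;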
On Thu, Oct 02, 2003 at 10:09:12PM +0300, Hannu Krosing wrote:
> So what you really need is for the CLUSTER command to leave pages half-empty,
> and for the tuple placement logic on inserts/updates to place new tuples
> near the place where they would be put by CLUSTER. I.e. the code that
> does the actual inserting should be aware of CLUSTERing.

Oh, you mean like what I half-proposed in
http://archives.postgresql.org/pgsql-general/2002-06/msg00968.php

BTW, this is the one message (the one in the CVS log I cite at the end) that got me into Postgres hacking. The little involvement I've had has been great. Thanks to all PostgreSQL hackers!

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"The realist knows what he wants; the idealist wants what he knows." (Anonymous)
James Rogers wrote on Thu, 02.10.2003 at 23:44:
> On Thu, 2003-10-02 at 12:09, Hannu Krosing wrote:
> > So what you really need is for the CLUSTER command to leave pages half-empty,
> > and for the tuple placement logic on inserts/updates to place new tuples
> > near the place where they would be put by CLUSTER. I.e. the code that
> > does the actual inserting should be aware of CLUSTERing.
>
> Not exactly. What you are describing is more akin to partitioning or
> hash-organized tables, i.e. sorting insert/update tuples into various pages
> according to some hash function.

What I actually thought I was describing is how CLUSTER should work in a Postgres flavour of MVCC storage ;). Not the CLUSTER command, but the whole feature.

> B-tree organized tables basically make the table and the primary index
> the same thing, and tend to be most useful for tables that use a single
> multi-column primary key index for queries. This has the effect of
> putting all the tuples for a typical query in the same small number of
> heap pages (and therefore buffers), allowing very efficient access in
> the typical case, with the caveat that non-indexed queries will be quite
> slow.

AFAICS we could resolve this problem (querying indexes only) by keeping a copy of the visibility info (tmin, tmax, ...) in index tuples. This would make index updates bigger and thus slower, so it should be optional. If you then put all fields in the primary key, the main table could be dropped. If there is no "data" table, then no other indexes would be allowed, or they would have to be "double-indexes" referencing the primary key rather than the tuple, and thus even bigger...

> B-tree organized tables are particularly useful when the insert order is
> orthogonal to the typical query order. As I mentioned before, tables
> that store parallel collections of time-series data are classic
> examples. Often the data is inserted into the pages in an order that can
> roughly be described as (timestamp, id), but is queried using (id,
> timestamp) as the index. If you have enough ids, you end up with the
> pathological case where you typically have one relevant tuple per page
> for a given query.

But if we had clustered the table on (id, timestamp), then the data would be in the right order for queries, if CLUSTER worked well.

> The nuisance would be keeping track of which pages are collecting which
> tuples. Knowing the CLUSTER index doesn't tell you much about which
> pages would currently be a good place to put a new tuple. You could
> always mark up the index that CLUSTER uses to keep track of good
> candidates (plus some additional structures), but the more I think about
> that, the more it looks like a nasty hack.

Yeah, index-organized tables seem an exact fit for your problem, but then my abstract idea of what clustering should do is exactly that - keep tuples in roughly the same order as an index ;)

So what is really needed is a smart tuple-placer which can keep tuples that are close (as defined by the index) together in a small number of pages. These pages themselves need not be contiguous; they can be sprinkled around the whole data table, but they need to stay clusters of index-close tuples.

------------
Hannu
On 10/2/03 11:34 PM, "Hannu Krosing" <hannu@tm.ee> wrote:
> James Rogers wrote on Thu, 02.10.2003 at 23:44:
> > Not exactly. What you are describing is more akin to partitioning or
> > hash-organized tables, i.e. sorting insert/update tuples into various pages
> > according to some hash function.
>
> What I actually thought I was describing is how CLUSTER should work in a
> Postgres flavour of MVCC storage ;). Not the CLUSTER command, but the
> whole feature.

Yeah, I can see this. "Clustering" doesn't really imply ordering, just tuple proximity in the heap. Loosely implemented by default (i.e. only grouping a tuple if it is cheap to do so at insert/update time) on a table's primary key, it might give a measurable query performance improvement in the typical case without impacting the average performance of queries that use other indexes. Even if the tuples were not ordered in the heap, tuple proximity in the heap would solve a significant percentage of the performance issue that index-organized tables solve, i.e. greatly improved cache efficiency.

> AFAICS we could resolve this problem (querying indexes only) by keeping
> a copy of the visibility info (tmin, tmax, ...) in index tuples. This
> would make index updates bigger and thus slower, so it should be optional.

It wouldn't be a feature you would want to use anyway if actually indexing a table. But yes, adding the necessary support to make an index page masquerade as a proper heap page would be most of the effort. That, and making the SQL execution efficiently use the fact that the index and the table are the same relation with two different interfaces, such that you don't have to modify or access "both" when using either one.

> If you then put all fields in the primary key, the main table could be
> dropped. If there is no "data" table, then no other indexes would be
> allowed, or they would have to be "double-indexes" referencing the primary
> key rather than the tuple, and thus even bigger...

Putting an index on an index is pretty horrible. But quite frankly, if you actually require a second index on a table, then B-tree organized tables are not what you need.

> But if we had clustered the table on (id, timestamp), then the data
> would be in the right order for queries, if CLUSTER worked well.

Right, and it doesn't even have to be in perfect index order, really, as what I'm trying to do is significantly decrease the number of buffers required for a typical query. Having to order the tuples later is a small matter.

> Yeah, index-organized tables seem an exact fit for your problem, but then
> my abstract idea of what clustering should do is exactly that - keep
> tuples in roughly the same order as an index ;)

It would be a good approximation of what index-organizing aims to accomplish.

> So what is really needed is a smart tuple-placer which can keep tuples
> that are close (as defined by the index) together in a small number of
> pages. These pages themselves need not be contiguous; they can be
> sprinkled around the whole data table, but they need to stay clusters
> of index-close tuples.

More or less, yes.

Cheers,

-James Rogers
 jamesr@best.com
James Rogers <jamesr@best.com> writes:
> On 10/2/03 11:34 PM, "Hannu Krosing" <hannu@tm.ee> wrote:
> >
> > What I actually thought I was describing is how CLUSTER should work in a
> > Postgres flavour of MVCC storage ;). Not the CLUSTER command, but the
> > whole feature.
>
> Yeah, I can see this. "Clustering" doesn't really imply ordering, just
> tuple proximity in the heap.

So I'm a bit confused about the term "clustering". It seems Postgres uses it to mean simply ordering the tuple storage within an otherwise normal table. However, in other databases it seems to mean something more complex.

I've never used it in Oracle, but from what I read it seems Oracle thinks "clustering" means storing the tuples for one table in the heap of *another* table entirely. The typical usage scenario envisioned is to store the tuples for child records near the parent record in a related table. I.e., say you have a users table and an ACL list, where each user has possibly many entries on the ACL list. You could cluster the ACL entries onto the users table so that the entries for a given user would be stored *in the users table heap* near the record for that user.

I've never seen anyone use this feature, and I never seriously considered it myself. It sort of has the feel of an antiquated feature that traded too much flexibility and abstraction for raw performance on very slow disk hardware. However, I wonder if the "nested tables" feature doesn't use it under the hood; it seems they would both be useful for the same types of tables.

I'm not sure what this means for Postgres. I'm not sure if Postgres should use a different name to avoid confusion, and possibly to leave room in the future for the possibility of supporting something like this. Or perhaps something like this would be useful for Postgres now or in the near future? Or perhaps the consensus is, as I said, that this is an old idea that no longer gets any respect, and Postgres should just pretend it doesn't exist?

--
greg
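For reference, a minimal sketch of the Oracle feature Greg describes, applied to his users/ACL example; Oracle dialect, with all names invented for illustration:

  -- Define the cluster key and the index it requires.
  CREATE CLUSTER user_cluster (user_id NUMBER)
      SIZE 512;  -- rough bytes expected per cluster key value
  CREATE INDEX user_cluster_idx ON CLUSTER user_cluster;

  -- Both tables share the cluster's storage: the ACL entries for a
  -- given user_id land in the same blocks as that user's row.
  CREATE TABLE users (
      user_id  NUMBER PRIMARY KEY,
      name     VARCHAR2(100)
  ) CLUSTER user_cluster (user_id);

  CREATE TABLE acl_entries (
      user_id  NUMBER NOT NULL,
      acl      VARCHAR2(100)
  ) CLUSTER user_cluster (user_id);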
Greg Stark <gsstark@mit.edu> writes:
> So I'm a bit confused about the term "clustering". It seems Postgres uses it
> to mean simply ordering the tuple storage within an otherwise normal table.
> However, in other databases it seems to mean something more complex.

My take is that "clustering" means not only placing related tuples near each other, but taking steps to preserve that organization over time. PG is really misusing the term, because our current CLUSTER command does the first but not the second. If tuple insertions were to try to preserve the heap ordering induced by the latest CLUSTER command, then I think we'd have something closer to what is usually meant by the term.

> I've never used it in Oracle, but from what I read it seems Oracle
> thinks "clustering" means storing the tuples for one table in the heap
> of *another* table entirely.

I think that's an implementation detail rather than the fundamental meaning.

			regards, tom lane
Greg Stark wrote on Sun, 05.10.2003 at 00:17:
> I've never seen anyone use this feature, and I never seriously considered it
> myself. It sort of has the feel of an antiquated feature that traded too much
> flexibility and abstraction for raw performance on very slow disk hardware.

Read "A Conversation with Jim Gray", referenced from this slashdot article:
http://slashdot.org/article.pl?sid=03/09/17/1246255&mode=thread&tid=126
for info on how disk drives are slower than ever (relatively), and how one should treat them as such, especially for large data volumes.

> However, I wonder if the "nested tables" feature doesn't use it under the
> hood; it seems they would both be useful for the same types of tables.
>
> I'm not sure what this means for Postgres. I'm not sure if Postgres should
> use a different name to avoid confusion, and possibly to leave room in the
> future for the possibility of supporting something like this. Or perhaps
> something like this would be useful for Postgres now or in the near future?
> Or perhaps the consensus is, as I said, that this is an old idea that no
> longer gets any respect, and Postgres should just pretend it doesn't exist?

We can't pretend CLUSTER does not exist until we have some better technology to offer instead.

------------
Hannu
Hannu Krosing wrote:
> Greg Stark wrote on Sun, 05.10.2003 at 00:17:
>
> > I've never seen anyone use this feature, and I never seriously considered it
> > myself. It sort of has the feel of an antiquated feature that traded too much
> > flexibility and abstraction for raw performance on very slow disk hardware.
>
> Read "A Conversation with Jim Gray", referenced from this slashdot article:
> http://slashdot.org/article.pl?sid=03/09/17/1246255&mode=thread&tid=126
> for info on how disk drives are slower than ever (relatively), and how
> one should treat them as such, especially for large data volumes.

Too bad PostgreSQL is misspelled ("Postgress") and MySQL dominates the open source discussion. And the MySQL questions are coming from "David Patterson, who holds the Pardee Chair of Computer Science at the University of California at Berkeley." Outrageous! :-)

Mike Mascari
mascarm@mascari.com
Hi,

I've been going through the source tree (pretty nice, BTW) and have found a number of things I'd like to work on. My primary interest is in adding performance and scalability features to the underlying engine. However, one thing I haven't found is a roadmap/schedule for major architectural features, i.e. while some of the features could be incrementally hacked into the current source pretty transparently, others might generate a bigger footprint. I guess what I am wondering is whether there is any kind of quasi-official mechanism for putting certain features in certain future versions, depending on the nature and significance of implementing those features and the number of things they touch.

Features I am interested in working on in the relatively short term:

* Buffer Management
  - A more optimal/adaptive cache replacement algorithm
  - Use of fine-grained adaptive latching to increase efficiency on SMP systems (particularly 4+ processor systems)
  - Add a background cache writer process to opportunistically flush dirty buffers

* Heap Organization
  - Hash partitioned heaps
  - B-tree organized heaps

While these should generate some performance improvements on smaller databases, more important to me is that they will go a long way toward making it scale more like Oracle when dealing with very large and active systems (my current problem). It has been a few years, but I've done implementations of most of these features in software before.

So I guess I have two questions:

1) Does anyone object to me working on these two areas?
2) What version target should I realistically be shooting for?

Cheers,

-James Rogers
 jamesr@best.com
James Rogers wrote:
> While these should generate some performance improvements on smaller
> databases, more important to me is that they will go a long way toward
> making it scale more like Oracle when dealing with very large and active
> systems (my current problem). It has been a few years, but I've done
> implementations of most of these features in software before.
>
> So I guess I have two questions:
>
> 1) Does anyone object to me working on these two areas?

Sounds good. I assume you have read the developer's FAQ. I would ask on the list about each item so we can give you tips and get you in touch with anyone else who knows about that topic.

> 2) What version target should I realistically be shooting for?

They would all be 7.5. We are only doing bug fixes now. There is no limit to what can be added to a major release, if that is your concern.

--
Bruce Momjian                    |  http://candle.pha.pa.us
pgman@candle.pha.pa.us           |  (610) 359-1001
+ If your life is a hard drive,  |  13 Roberts Road
+ Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
James Rogers wrote on Tue, 07.10.2003 at 01:48:
> Hi,
>
> I guess what I am wondering is whether there is any kind of quasi-official
> mechanism for putting certain features in certain future versions,
> depending on the nature and significance of implementing those features
> and the number of things they touch.

I don't know of any (except discussing it on [HACKERS] ;), but I very much like the way Python does it via PEPs (Python Enhancement Proposals); see: http://www.python.org/peps/ .

-----
Hannu