Thread: possible TODO: read-only tables, select from indexes only.
Fetching data from just indexes has been discussed on this list several times before, and it has been told that this can't be done with postgres thanks to MVCC. But this is true only when data is changing. In a data-warehousing scenario what it is often needed is a possibility for fast querying of static historical data. If table has been VACUUM'ed or REINDEX'ed after last change (update/delete/insert), then there is 1-1 correspondence between table and indexes, and thus no need to fetch the tuple from heap for extra visibility checks. What I am proposing is 1) possibility to explicitly change table status to READ-ONLY . 2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables that are READ-ONLY 3) changing postgres planner/executor to make use of this flag, by not going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true. -- Hannu Krosing <hannu@skype.net>
See this TODO:* Allow data to be pulled directly from indexes Currently indexes do not have enough tuple visibility information to allow data to be pulled from the index without also accessing the heap. One way to allow this is to seta bit to index tuples to indicate if a tuple is currently visible to all transactions when the first valid heap lookuphappens. This bit would have to be cleared when a heap tuple is expired. I think this is the direction we should be heading because it has more general usefulness. --------------------------------------------------------------------------- Hannu Krosing wrote: > Fetching data from just indexes has been discussed on this list several > times before, and it has been told that this can't be done with postgres > thanks to MVCC. > > But this is true only when data is changing. In a data-warehousing > scenario what it is often needed is a possibility for fast querying of > static historical data. > > If table has been VACUUM'ed or REINDEX'ed after last change > (update/delete/insert), then there is 1-1 correspondence between table > and indexes, and thus no need to fetch the tuple from heap for extra > visibility checks. > > What I am proposing is > > 1) possibility to explicitly change table status to READ-ONLY . > > 2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables > that are READ-ONLY > > 3) changing postgres planner/executor to make use of this flag, by not > going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true. > > -- > Hannu Krosing <hannu@skype.net> > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On 4/22/05, Hannu Krosing wrote: > Fetching data from just indexes has been discussed on this list several > times before, and it has been told that this can't be done with postgres > thanks to MVCC. > > But this is true only when data is changing. In a data-warehousing > scenario what it is often needed is a possibility for fast querying of > static historical data. > > If table has been VACUUM'ed or REINDEX'ed after last change > (update/delete/insert), then there is 1-1 correspondence between table > and indexes, and thus no need to fetch the tuple from heap for extra > visibility checks. > > What I am proposing is > > 1) possibility to explicitly change table status to READ-ONLY . > > 2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables > that are READ-ONLY Why stop at indexes? If you switch to readonly status with a CLUSTER or ALTER TABLE command you can even remove the visibility information from the heap tuples. Visibility would be exclusively controlled by the visibility of the table in the catalog, i.e. all or nothing. Jochem
On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote: > See this TODO: > > * Allow data to be pulled directly from indexes > > Currently indexes do not have enough tuple visibility information > to allow data to be pulled from the index without also accessing > the heap. One way to allow this is to set a bit to index tuples > to indicate if a tuple is currently visible to all transactions > when the first valid heap lookup happens. I don't think this is implementable in any reasonably cheap way (i.e. this will be a general performance loss). This has all the downsides of storing full visibility in index tuples, except the size. > This bit would have to be cleared when a heap tuple is expired. Does "expired" here mean marked for deletion ? This can be prohibitively pricey for big tables with lots of indexes, as marking the tuple means alsn finding and possibly writing to all index enytries pointing to this tuple. > I think this is the direction we should be heading because it has more > general usefulness. OTOH this will probably never be implemented for the same reason that full visibility in index tuples will not, whereas my proposition can be implemented quite easily (it's just a SMOP). > --------------------------------------------------------------------------- > > Hannu Krosing wrote: > > Fetching data from just indexes has been discussed on this list several > > times before, and it has been told that this can't be done with postgres > > thanks to MVCC. > > > > But this is true only when data is changing. In a data-warehousing > > scenario what it is often needed is a possibility for fast querying of > > static historical data. > > > > If table has been VACUUM'ed or REINDEX'ed after last change > > (update/delete/insert), then there is 1-1 correspondence between table > > and indexes, and thus no need to fetch the tuple from heap for extra > > visibility checks. > > > > What I am proposing is > > > > 1) possibility to explicitly change table status to READ-ONLY . > > > > 2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables > > that are READ-ONLY > > > > 3) changing postgres planner/executor to make use of this flag, by not > > going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true. > > > > -- > > Hannu Krosing <hannu@skype.net> > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/docs/faq > > -- Hannu Krosing <hannu@skype.net>
Hannu Krosing wrote: > On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote: > > See this TODO: > > > > * Allow data to be pulled directly from indexes > > > > Currently indexes do not have enough tuple visibility information > > to allow data to be pulled from the index without also accessing > > the heap. One way to allow this is to set a bit to index tuples > > to indicate if a tuple is currently visible to all transactions > > when the first valid heap lookup happens. > > I don't think this is implementable in any reasonably cheap way (i.e. > this will be a general performance loss). > > This has all the downsides of storing full visibility in index tuples, > except the size. Yea, I suppose. > > This bit would have to be cleared when a heap tuple is expired. > > Does "expired" here mean marked for deletion ? Yes. > This can be prohibitively pricey for big tables with lots of indexes, as > marking the tuple means alsn finding and possibly writing to all index > enytries pointing to this tuple. Yep, it could be very ugly, but it would help with our COUNT(*) problem too. Isn't there a solution? If there isn't, I can remove the TODO item. > > I think this is the direction we should be heading because it has more > > general usefulness. > > OTOH this will probably never be implemented for the same reason that > full visibility in index tuples will not, whereas my proposition can be > implemented quite easily (it's just a SMOP). I would like to find something that has more general usefulness. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
You should read the archives of this list; there was a pretty long thread about this a few months ago. IIRC the consensus after much debate was that this feature would add benefit in many instances, especially on large tables where only a small amount of data changes. Also, I think there is value to supporting read only tables, in addition to the index visibility info. I also like the idea of being able to have heap tuples that don't have visibility information, though I'm not sure how hard that would be to do. But it would certainly be useful to pull the version bytes out of a 200M row table. On Fri, Apr 22, 2005 at 07:25:19PM +0300, Hannu Krosing wrote: > On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote: > > See this TODO: > > > > * Allow data to be pulled directly from indexes > > > > Currently indexes do not have enough tuple visibility information > > to allow data to be pulled from the index without also accessing > > the heap. One way to allow this is to set a bit to index tuples > > to indicate if a tuple is currently visible to all transactions > > when the first valid heap lookup happens. > > I don't think this is implementable in any reasonably cheap way (i.e. > this will be a general performance loss). > > This has all the downsides of storing full visibility in index tuples, > except the size. > > > This bit would have to be cleared when a heap tuple is expired. > > Does "expired" here mean marked for deletion ? > > This can be prohibitively pricey for big tables with lots of indexes, as > marking the tuple means alsn finding and possibly writing to all index > enytries pointing to this tuple. > > > I think this is the direction we should be heading because it has more > > general usefulness. > > OTOH this will probably never be implemented for the same reason that > full visibility in index tuples will not, whereas my proposition can be > implemented quite easily (it's just a SMOP). > > > --------------------------------------------------------------------------- > > > > Hannu Krosing wrote: > > > Fetching data from just indexes has been discussed on this list several > > > times before, and it has been told that this can't be done with postgres > > > thanks to MVCC. > > > > > > But this is true only when data is changing. In a data-warehousing > > > scenario what it is often needed is a possibility for fast querying of > > > static historical data. > > > > > > If table has been VACUUM'ed or REINDEX'ed after last change > > > (update/delete/insert), then there is 1-1 correspondence between table > > > and indexes, and thus no need to fetch the tuple from heap for extra > > > visibility checks. > > > > > > What I am proposing is > > > > > > 1) possibility to explicitly change table status to READ-ONLY . > > > > > > 2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables > > > that are READ-ONLY > > > > > > 3) changing postgres planner/executor to make use of this flag, by not > > > going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true. > > > > > > -- > > > Hannu Krosing <hannu@skype.net> > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 5: Have you checked our extensive FAQ? > > > > > > http://www.postgresql.org/docs/faq > > > > > -- > Hannu Krosing <hannu@skype.net> > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match > -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
Jochem van Dieten wrote: > On 4/22/05, Hannu Krosing wrote: >>...But this is true only when data is changing. In a data-warehousing >>scenario what it is often needed is a possibility for fast querying of >>static historical data. And when we get partitioning, I think many data warehouses will have the bulk of their data in tables like that (previous quarters marked read-only, current quarter growing). >>What I am proposing is >> >>1) possibility to explicitly change table status to READ-ONLY . >>2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables >>that are READ-ONLY > > Why stop at indexes? If you switch to readonly status with a CLUSTER > or ALTER TABLE command you can even remove the visibility information > from the heap tuples. Visibility would be exclusively controlled by > the visibility of the table in the catalog, i.e. all or nothing. Seems this could reduce the size of some data warehouses considerably too. My biggest tables have nothing but columns of IDs; and the 30-some bytes of the row header is a significant fraction of the entire size. I think the diskspace === I/O bandwidth savings on the heap would be almost as big a benefit as the indexes.
On Fri, 2005-04-22 at 18:34 +0300, Hannu Krosing wrote: > Fetching data from just indexes has been discussed on this list several > times before, and it has been told that this can't be done with postgres > thanks to MVCC. > > But this is true only when data is changing. In a data-warehousing > scenario what it is often needed is a possibility for fast querying of > static historical data. > > If table has been VACUUM'ed or REINDEX'ed after last change > (update/delete/insert), then there is 1-1 correspondence between table > and indexes, and thus no need to fetch the tuple from heap for extra > visibility checks. > > What I am proposing is > > 1) possibility to explicitly change table status to READ-ONLY . > > 2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables > that are READ-ONLY > > 3) changing postgres planner/executor to make use of this flag, by not > going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true. > Hannu, I'm with you on this, for all sorts of reasons. I've done a lot of detailed thought on this. As Ron says, until we have partitioning, it isn't as useful for DW. Best Regards, Simon Riggs
On Sat, 23 Apr 2005 03:14 am, Bruce Momjian wrote: > Hannu Krosing wrote: > > On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote: > > > See this TODO: > > > > > > * Allow data to be pulled directly from indexes > > > > > > Currently indexes do not have enough tuple visibility information > > > to allow data to be pulled from the index without also accessing > > > the heap. One way to allow this is to set a bit to index tuples > > > to indicate if a tuple is currently visible to all transactions > > > when the first valid heap lookup happens. > > Storing visibility information in the index has always been put down as a cause of performance problems. Would it be plausible to have an index type that included the information and one that didn't. You could choose which way you wanted to go. I know especially for some tables, I would choose this index with visibility as it would increase performance by not looking at the table at all for that information (count being a good example). However for general purpose indexes I would use the normal index with no visibility information. The possibly of the bit method or full tuples is probably a decision for others, but having the flexibility to choose in this would be a great thing. Regards Russell Smith
Russell Smith wrote: > On Sat, 23 Apr 2005 03:14 am, Bruce Momjian wrote: > > Hannu Krosing wrote: > > > On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote: > > > > See this TODO: > > > > > > > > * Allow data to be pulled directly from indexes > > > > > > > > Currently indexes do not have enough tuple visibility information > > > > to allow data to be pulled from the index without also accessing > > > > the heap. One way to allow this is to set a bit to index tuples > > > > to indicate if a tuple is currently visible to all transactions > > > > when the first valid heap lookup happens. > > > > Storing visibility information in the index has always been put down > as a cause of performance problems. Would it be plausible to have an > index type that included the information and one that didn't. You > could choose which way you wanted to go. > > I know especially for some tables, I would choose this index with > visibility as it would increase performance by not looking at the table > at all for that information (count being a good example). However for > general purpose indexes I would use the normal index with no visibility > information. > > The possibly of the bit method or full tuples is probably a decision > for others, but having the flexibility to choose in this would be a > great thing. I thought a little bit about the complexity. I think we should use a heap bit to say if the index "all visible" bit is set. The idea is that you only set the index "all visible" bit after the tuple has been visible for a while, perhaps 1000 transactions, and a GUC could countrol that number. I assume that will reduce the overhead of clearing the index "all visible" bit on an UPDATE or DELETE to an acceptable load. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > See this TODO: * Allow data to be pulled directly from indexes > I think this is the direction we should be heading because it has more > general usefulness. I think read-only tables would have a few different types of general usefulness in addition to enabling index scans. Is this a fair summary of the potential benefits of READ-ONLY tables? (from both this thread and the archives): 1. Index-only scans are made possible fairly easily because you wouldn't need to check the heap for visibility. 2. Simple tables can be much smaller since you don't need most of the HeapTupleHeaderData. This reduction in space translates to a reduction in I/O through better use of the shared memory and OS caches. 3. A Read-Only Clustered table could be assumed to be sorted, so you could avoid some Sort steps for things likeGroupAggregates and Merge Joins. Any others? The biggest/slowest tables in my database happen to be entirely read only (road network and other GIS features for the country; and data warehouse tables representing previous quarter's data).
Simon Riggs wrote: > > As Ron says, until we have partitioning, it isn't as useful for DW. Well, it's a bit stronger than what I said. The last big DW I dealt with did have previous quarter's data archived into different tables, and those could be marked read-only. Also, quite a few of our ID<->Value lookup tables have a fixed or slowly changing set of values (products, states, etc); and those would benefit from index scans. But yes, it'd be an even nicer feature with partitioning, since then you could hypothetically keep inserting into a mostly-read-only table and mark partitions read-only as needed. I think I was trying to say it'd be even more useful for DW with partitioning - and I hope that when partitioning is being designed it considers the possibility of taking advantage of read-only partitions if we happen to implement read-only tables.
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes: > Is this a fair summary of the potential benefits of READ-ONLY > tables? (from both this thread and the archives): > 1. Index-only scans are made possible fairly easily because > you wouldn't need to check the heap for visibility. > 2. Simple tables can be much smaller since you don't need > most of the HeapTupleHeaderData. What you are talking about is not a "read only" table, it is a "non-MVCC" table. This is a much greater assault on the fundamental semantics of Postgres than it's being painted to be in this thread. In particular, how is such a table going to come into being? You'd not be able to just flip the READ-ONLY flag on and off. (The notion of having tuples in the system that don't have the standard HeapTupleHeader is not as easy to implement as you might think, either, because that data structure is *everywhere*.) While I don't say it's impossible to do, I do think that the work and semantic ugliness involved would outweigh the possible benefits. In particular, there are other, more transparent ways of doing #1. regards, tom lane
On Sat, Apr 23, 2005 at 06:27:38PM -0400, Tom Lane wrote: > While I don't say it's impossible to do, I do think that the work and > semantic ugliness involved would outweigh the possible benefits. In > particular, there are other, more transparent ways of doing #1. One idea that may be closer to reality might be implementing heaps that behave as indexes (or indexes that behave as heaps), or clustered indexes as some other database call them; the main idea being that at the leaf nodes of the index, the whole heap tuple resides instead of only a CTID. One problem I see with that approach is the maximum tuple size ... our current btree index code can't handle tuples bigger than (BLCKSZ/3) IIRC, some 2 kB. -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) "Vivir y dejar de vivir son soluciones imaginarias. La existencia está en otra parte" (Andre Breton)
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > One idea that may be closer to reality might be implementing heaps that > behave as indexes (or indexes that behave as heaps), or clustered > indexes as some other database call them; the main idea being that at > the leaf nodes of the index, the whole heap tuple resides instead of > only a CTID. The main problem with this is that you'd not have a stable TID for a given tuple, since it might be forced to move by operations such as index page splits. It might be impractical to support any secondary indexes on such a table --- but I can definitely see that there are applications that wouldn't care. > One problem I see with that approach is the maximum tuple size ... our > current btree index code can't handle tuples bigger than (BLCKSZ/3) > IIRC, some 2 kB. So you toast 'em ... I don't see this as a fatal drawback. regards, tom lane
>> 2. Simple tables can be much smaller since you don't need >> most of the HeapTupleHeaderData. >> >> > >What you are talking about is not a "read only" table, it is a >"non-MVCC" table. This is a much greater assault on the fundamental >semantics of Postgres than it's being painted to be in this thread. >In particular, how is such a table going to come into being? You'd >not be able to just flip the READ-ONLY flag on and off. > > > I can see it now.... Nov 21, 2005: Earlier today, the PostgreSQL Global Development group issued a press release announcing the availability of PostMySQL 10g. PostMySQL 10g continues PostgreSQL's venerable legacy of doing things the Right Way, but brings great performance enhancements to read-only or light-update loads through the introduction of PostMyISAM tables. PostMyISAM tables are eagerly awaited by the data warehousing community because of their more compact on-disk representation and because their lack of tuple visibility information allows queries to be answered directly from the index. The move is widely seen as an effort to gain market share at the expense of MySQL. MySQL AB CEO Marten Mickos was quick to point out that the new features made PostgreSQL much more like MySQL, but that their implementation had serious flaws. "Their [PostMySQL's] implementation is nearly useless in an enterprise OLAP environment. ... In a double-blind test of the two systems, users preferred the mature MySQL implementation that does the Best Thing Under the Circumstances (tm) rather than PostMySQL's fixation with data integrity. While attempting to load the sample data before running benchmarks the user found in 'unbelievably' slow. After consulting with PostgreSQL experts, the user was instructed to tune various default configuration parameters and wrap the inserts in a transaction. After doing so, the loading proceeded at a reasonable speed. "However, it was discovered that one of the records in the 120 gigabyte import data had February 29, 2003 stored in a date field. The PostMySQL database engine refused to store this date, and rolled the entire inserting transaction back. Apparently this is a non-trivial action for the PostMyISAM tables, as after 15 minutes the user thought that the database had locked up and terminated the postmaster process, causing more trouble when the database would not start. [Editor's comment: the corruption was blamed on a bug introduced during the somewhat hasty rewrite of the entire PostgreSQL code base in C++.] After doing initdb again and editing the import data by hand, the user managed to load the data. "In conclusion, ... the new PostMyISAM tables negate all of PostMySQL's claimed ACIDity and concurrency benefits, while we [MySQLAB] offer a mature, stable implementation that requires much less work to configure, is more flexible and forgiving of the surprises found in real-world databases, and enjoys a much larger user community and comprehensive suite of 3rd party tools."
On L, 2005-04-23 at 18:27 -0400, Tom Lane wrote: > Ron Mayer <rm_pg@cheapcomplexdevices.com> writes: > > Is this a fair summary of the potential benefits of READ-ONLY > > tables? (from both this thread and the archives): > > > 1. Index-only scans are made possible fairly easily because > > you wouldn't need to check the heap for visibility. > > > 2. Simple tables can be much smaller since you don't need > > most of the HeapTupleHeaderData. > > What you are talking about is not a "read only" table, it is a > "non-MVCC" table. This is a much greater assault on the fundamental > semantics of Postgres than it's being painted to be in this thread. That's why I proposed a much less ambitious, and much more "low-hanging- fruit-first" compliant thing when I satrted the thread - a simple two- flag system to mark a relation as safe to use for index-only queries. Main use of that would be in partiotioned table setups in data warehouses, where older partitions can be switched to read-only state. and such all-or-nothing scheme would also make it much easier to estimate cost of index(only)scan. > In particular, how is such a table going to come into being? You'd > not be able to just flip the READ-ONLY flag on and off. original idea (with HeapTupleHeader intact) was to first mark the table as READ-ONLY, and then either run VACUUM, or preferrably VACUUM-FULL + REINDEX TABLE on it, so that all index tuples point to valid and visible tuples. A check must be made to make sure, that all transactions started before setting the READ-ONLY flag have finished before starting VACUUM or REINDEX. > (The notion of having tuples in the system that don't have the standard > HeapTupleHeader is not as easy to implement as you might think, either, > because that data structure is *everywhere*.) My impression was, that HeapTupleHeader is usually not carried with fields after doing the initial visibility checks ? But if it is needed, then it should be added when generating tuples from index scan, preferrably in such a way, that non-SELECT queries get these tuples with Xids set in a way which prevent them from being modified. > While I don't say it's impossible to do, I do think that the work and > semantic ugliness involved would outweigh the possible benefits. In > particular, there are other, more transparent ways of doing #1. If HeapTupleHeader is essential for pg, than #1 should generate fake HeapTupleHeader with some constant values (xmin=FrozenTransactionId, xmax=MAXINT). BTW, do we really store tableoid column in heap tuples or is it added somewhere on the way from heap ? -- Hannu Krosing <hannu@tm.ee>
On L, 2005-04-23 at 00:02 -0400, Bruce Momjian wrote: > Russell Smith wrote: ... > > I know especially for some tables, I would choose this index with > > visibility as it would increase performance by not looking at the table > > at all for that information (count being a good example). However for > > general purpose indexes I would use the normal index with no visibility > > information. > > > > The possibly of the bit method or full tuples is probably a decision > > for others, but having the flexibility to choose in this would be a > > great thing. > > I thought a little bit about the complexity. I think we should use a > heap bit to say if the index "all visible" bit is set. The idea is that > you only set the index "all visible" bit after the tuple has been > visible for a while, perhaps 1000 transactions, and a GUC could countrol > that number. I assume that will reduce the overhead of clearing the > index "all visible" bit on an UPDATE or DELETE to an acceptable load. Having the full index either CHECK-FROM-HEAP or INDEX-ONLY makes planning decisions much simpler. If the index has some leafs that require heap access and others that don't, we need to also start keeping some statistics about what proportion of index lookups result in heap accesses, possibly correlated with certain index regions. I think that starting from something simple (like read-only tables :) gives most gain for effort. -- Hannu Krosing <hannu@tm.ee>
On R, 2005-04-22 at 13:14 -0400, Bruce Momjian wrote: > > This can be prohibitively pricey for big tables with lots of indexes, as > > marking the tuple means alsn finding and possibly writing to all index > > enytries pointing to this tuple. > > Yep, it could be very ugly, but it would help with our COUNT(*) problem > too. Isn't there a solution? If there isn't, I can remove the TODO > item. > > > > I think this is the direction we should be heading because it has more > > > general usefulness. > > > > OTOH this will probably never be implemented for the same reason that > > full visibility in index tuples will not, whereas my proposition can be > > implemented quite easily (it's just a SMOP). > > I would like to find something that has more general usefulness. So would I :) But I assure you that this would be generally usefull in DataWarehousing applications, where you have to play partitioning tricks anyway and making some sub-table RO and running REINDEX on it would add little complexity.. This would ease my current problem of running queries over tables with more than >100 M rows and getting the results in reasonable time. My setup is a big logical table, which consists of many inherited tables, filled one after another from OLAP database. After each 5 to 10 M rows old table is frozen and new table started. So when I run a query that uses an index, which does not correlate well with primary_key and timestamp, postgres finds the needed rows from index quickly and spends most of the time waiting for seeks from heap- tuple accesses for visibility checks, which is pure wasting of resources as they all succeed. I guess avoiding heap tuple lookups could make some of the queries run 10's maybe 100's of times faster, as index tuples are naturally clustered. -- Hannu Krosing <hannu@tm.ee>
On 4/24/05, Tom Lane wrote: > > What you are talking about is not a "read only" table, it is a > "non-MVCC" table. This is a much greater assault on the fundamental > semantics of Postgres than it's being painted to be in this thread. > In particular, how is such a table going to come into being? You need an operation that rewrites the entire table and all indexes at the same time. That pretty much means the only sensible way would be a special form of CLUSTER (including al the locking uglyness). If you need to make the table READ-WRITE again you need to run CLUSTER against it so the tuples get visibility information again and the fill factor of the indexes is reset to 0.7. But I have to agree with Ron that this probably makes much more sense when you have partitioned tables and do this on partitions instead of a full table. Archived might be a better word for it then read-only. Jochem
I send it now the 3rd time because I'm not sure my mail works, as it has not appeared on pgsql-hackers On L, 2005-04-23 at 18:27 -0400, Tom Lane wrote: > Ron Mayer <rm_pg@cheapcomplexdevices.com> writes: > > Is this a fair summary of the potential benefits of READ-ONLY > > tables? (from both this thread and the archives): > > > 1. Index-only scans are made possible fairly easily because > > you wouldn't need to check the heap for visibility. > > > 2. Simple tables can be much smaller since you don't need > > most of the HeapTupleHeaderData. > > What you are talking about is not a "read only" table, it is a > "non-MVCC" table. This is a much greater assault on the fundamental > semantics of Postgres than it's being painted to be in this thread. That's why I proposed a much less ambitious, and much more "low-hanging- fruit-first" compliant thing when I satrted the thread - a simple two- flag system to mark a relation as safe to use for index-only queries. Main use of that would be in partiotioned table setups in data warehouses, where older partitions can be switched to read-only state. and such all-or-nothing scheme would also make it much easier to estimate cost of index(only)scan. > In particular, how is such a table going to come into being? You'd > not be able to just flip the READ-ONLY flag on and off. original idea (with HeapTupleHeader intact) was to first mark the table as READ-ONLY, and then either run VACUUM, or preferrably VACUUM-FULL + REINDEX TABLE on it, so that all index tuples point to valid and visible tuples. A check must be made to make sure, that all transactions started before setting the READ-ONLY flag have finished before starting VACUUM or REINDEX. > (The notion of having tuples in the system that don't have the standard > HeapTupleHeader is not as easy to implement as you might think, either, > because that data structure is *everywhere*.) My impression was, that HeapTupleHeader is usually not carried with fields after doing the initial visibility checks ? But if it is needed, then it should be added when generating tuples from index scan, preferrably in such a way, that non-SELECT queries get these tuples with Xids set in a way which prevent them from being modified. > While I don't say it's impossible to do, I do think that the work and > semantic ugliness involved would outweigh the possible benefits. In > particular, there are other, more transparent ways of doing #1. If HeapTupleHeader is essential for pg, than #1 should generate fake HeapTupleHeader with some constant values (xmin=FrozenTransactionId, xmax=MAXINT). BTW, do we really store tableoid column in heap tuples or is it added somewhere on the way from heap ? -- Hannu Krosing <hannu@skype.net>
On Sat, 2005-04-23 at 15:04 -0700, Ron Mayer wrote: > Bruce Momjian wrote: > > See this TODO: * Allow data to be pulled directly from indexes > > I think this is the direction we should be heading because it has more > > general usefulness. > > I think read-only tables would have a few different types > of general usefulness in addition to enabling index scans. > > Is this a fair summary of the potential benefits of READ-ONLY > tables? (from both this thread and the archives): > > 1. Index-only scans are made possible fairly easily because > you wouldn't need to check the heap for visibility. > > 2. Simple tables can be much smaller since you don't need > most of the HeapTupleHeaderData. This reduction in > space translates to a reduction in I/O through better > use of the shared memory and OS caches. > > 3. A Read-Only Clustered table could be assumed to be > sorted, so you could avoid some Sort steps > for things like GroupAggregates and Merge Joins. 4. No need to VACUUM tables as part of a database-wide VACUUM, which is particularly important for larger databases. Best Regards, Simon Riggs