Thread: Re: [SQL] Yet Another (Simple) Case of Index not used
Dennis,

> I'm running into a quite puzzling simple example where the index I've
> created on a fairly big table (465K entries) is not used, against all
> common sense expectations:
> The query I am trying to do (fast) is:
>
> select count(*) from addresses;

PostgreSQL is currently unable to use indexes on aggregate queries. This is
because of two factors:
1) MVCC means that the number of rows must be recalculated for each
connection's current transaction, and cannot be "cached" anywhere by the
database system;
2) Our extensible model of user-defined aggregates means that each aggregate
is a "black box" whose internal operations are invisible to the planner.

This is a known performance issue for Postgres, and I believe that a couple
of people on Hackers are looking at modifying the aggregate implementation
for 8.0 to use appropriate available indexes, at least for MIN, MAX and
COUNT. Until then, you will need to either put up with the delay, or create
a trigger-driven aggregate caching table.

If you are trying to do a correlated count, like "SELECT type, count(*) from
aggregates GROUP BY type", Tom Lane has already added a hash-aggregates
structure in the 7.4 source that will speed this type of query up
considerably for systems with lots of RAM.

(PS: in the future, please stick to posting questions to one list at a time,
thanks)

--
-Josh Berkus
Aglio Database Solutions
San Francisco
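[A trigger-driven aggregate caching table of the sort Josh mentions might be
sketched as follows. This is illustrative only -- the table, function, and
trigger names are made up for the example, and the count is only as accurate
as MVCC visibility allows, which is the crux of the rest of this thread:]

```sql
-- Sketch of a trigger-maintained count cache (illustrative names).
CREATE TABLE addresses_count (n bigint NOT NULL);
INSERT INTO addresses_count VALUES (0);

CREATE FUNCTION addresses_count_trig() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE addresses_count SET n = n + 1;
    ELSIF TG_OP = ''DELETE'' THEN
        UPDATE addresses_count SET n = n - 1;
    END IF;
    RETURN NULL;  -- result is ignored for AFTER triggers
END;
' LANGUAGE plpgsql;

CREATE TRIGGER addresses_count_trig AFTER INSERT OR DELETE ON addresses
    FOR EACH ROW EXECUTE PROCEDURE addresses_count_trig();

-- "SELECT n FROM addresses_count" then stands in for the slow count(*).
```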
Josh,

I am no database expert, and even less knowledgeable about the internals of
postgresql, so I'll trust you on the 2 points you make below.

Are you saying the 7.4 'group by' trick would be faster than the simple
select count(*)? That seems hard to believe, being that the request now has
to fetch / sort the data. I must be missing something.

The kind of requests that I am really interested in are:
select count(*) from table where table.column like 'pattern%'
These seem to go much faster on mysql (which I guess is not an MVCC
database? or wasn't Innobase supposed to make it so?) than on postgresql.

So, in the meantime, I've decided to split up my data into two sets, the
static big tables which are handled by mysql, and the rest of it handled by
postgresql....

ps: apologies for the cross-posting.

> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: Tuesday, April 08, 2003 2:53 PM
> To: Denis; pgsql-performance@postgresql.org
> Subject: Re: [SQL] Yet Another (Simple) Case of Index not used
>
> Dennis,
>
> > I'm running into a quite puzzling simple example where the index I've
> > created on a fairly big table (465K entries) is not used, against all
> > common sense expectations:
> > The query I am trying to do (fast) is:
> >
> > select count(*) from addresses;
>
> PostgreSQL is currently unable to use indexes on aggregate queries. This
> is because of two factors:
> 1) MVCC means that the number of rows must be recalculated for each
> connection's current transaction, and cannot be "cached" anywhere by the
> database system;
> 2) Our extensible model of user-defined aggregates means that each
> aggregate is a "black box" whose internal operations are invisible to the
> planner.
>
> This is a known performance issue for Postgres, and I believe that a
> couple of people on Hackers are looking at modifying the aggregate
> implementation for 8.0 to use appropriate available indexes, at least for
> MIN, MAX and COUNT. Until then, you will need to either put up with the
> delay, or create a trigger-driven aggregate caching table.
>
> If you are trying to do a correlated count, like "SELECT type, count(*)
> from aggregates GROUP BY type", Tom Lane has already added a
> hash-aggregates structure in the 7.4 source that will speed this type of
> query up considerably for systems with lots of RAM.
>
> (PS: in the future, please stick to posting questions to one list at a
> time, thanks)
>
> --
> -Josh Berkus
> Aglio Database Solutions
> San Francisco
>
On Tue, 8 Apr 2003, Denis @ Next2Me wrote:

> The kind of requests that I am really interested in are:
> select count(*) from table where table.column like 'pattern%'

If you think an index scan should be faster, you can try
 set enable_seqscan=off;
and see if that changes the plan generated by explain, and with analyze you
can compare the time used. Without information on the estimated selectivity
it's hard to say what's right.

If it doesn't use the index (ie, it's still using a sequential scan) after
the enable_seqscan=off, it's likely that you didn't initdb in "C" locale,
in which case LIKE won't use indexes currently (you can see the archives
for the long description, but the short one is that some of the locale
rules can cause problems with using the index).
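[The diagnostic sequence suggested above would look roughly like this in
psql; the table and column names are placeholders, not from the thread:]

```sql
-- Compare the planner's choice with and without sequential scans allowed.
-- "addresses" and "city" are placeholder names for this sketch.
EXPLAIN ANALYZE SELECT count(*) FROM addresses WHERE city LIKE 'San%';

SET enable_seqscan = off;   -- strongly discourage seq scans for this session
EXPLAIN ANALYZE SELECT count(*) FROM addresses WHERE city LIKE 'San%';

SET enable_seqscan = on;    -- restore the default

-- If the second plan still shows "Seq Scan" rather than "Index Scan",
-- the index cannot be used at all -- most likely the non-"C"-locale
-- LIKE limitation described above.
```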
Hi Denis,

> The kind of requests that I am really interested in are:
> select count(*) from table where table.column like 'pattern%'
> These seem to go much faster on mysql (which I guess is not an MVCC
> database? or wasn't Innobase supposed to make it so?) than on postgresql.

A few things.

* MVCC in PostgreSQL allows us to be way faster than MySQL when you have
heaps of concurrent readers and writers. The tradeoff is that count(*) is
slow since PostgreSQL needs to check that each tuple is actually visible to
your query (eg. you start a transaction, someone else inserts a row, you do
a count(*) - should the result include that new row or not? Answer: no.)

* Just avoid doing count(*) over the entire table with no where clause!!!
It's as easy as that.

* The LIKE 'pattern%' is indexable in PostgreSQL. You will need to create a
normal btree index over table.column. So long as the index is returning a
small portion of the table (eg. say only 5-10% of the fields begin with
pattern), then the index will be used and it will be fast.

* If you want really fast full text indexing, check out contrib/tsearch -
it's really, really, really fast.

Chris
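[Chris's third point, sketched with placeholder names. The second index
shown assumes the pattern operator classes added in 7.4, which work around
the non-"C"-locale limitation mentioned earlier in the thread:]

```sql
-- A plain btree index makes anchored LIKE 'pattern%' queries indexable
-- (in a "C"-locale database). Names here are placeholders.
CREATE INDEX addresses_city_idx ON addresses (city);

SELECT count(*) FROM addresses WHERE city LIKE 'San%';

-- In a non-"C" locale the plain index cannot serve LIKE; PostgreSQL 7.4
-- adds pattern operator classes for exactly this case:
CREATE INDEX addresses_city_pat_idx ON addresses (city text_pattern_ops);
```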
Denis,

> Are you saying the 7.4 'group by' trick would be faster than the simple
> select count(*)? That seems hard to believe, being that the request now
> has to fetch / sort the data. I must be missing something.

No, I'm saying that the 7.4 hash-aggregate is faster than the same query
was under 7.2 or 7.3. Much faster. But it does little to speed up a raw
count(*).

> The kind of requests that I am really interested in are:
> select count(*) from table where table.column like 'pattern%'

Hash-aggregates may, in fact, help with this. Care to try downloading the
source from CVS?

> These seem to go much faster on mysql (which I guess is not an MVCC
> database? or wasn't Innobase supposed to make it so?)

They did incorporate a lot of MVCC logic into InnoDB tables, yes. Which
means that if SELECT count(*) on an InnoDB table is just as fast as on a
MyISAM table, then it is not accurate. This would be in keeping with
MySQL's design philosophy, which values performance and simplicity over
accuracy and precision -- the opposite of our philosophy.

> So, in the meantime, I've decided to split up my data into two sets,
> the static big tables which are handled by mysql, and the rest of it
> handled by postgresql....

Hey, if it works for you, it's probably easier than dealing with the
PostgreSQL workarounds to this performance issue. I'll ask you to give
PostgreSQL a try for those tables again when 7.4 comes out.

> ps: apologies for the cross-posting.

De nada. The Performance list is the right place for this sort of question
in the future.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Josh Berkus wrote:
> Denis,
>
> > Are you saying the 7.4 'group by' trick would be faster than the simple
> > select count(*)? That seems hard to believe, being that the request
> > now has to fetch / sort the data. I must be missing something.
>
> No, I'm saying that the 7.4 hash-aggregate is faster than the same query
> was under 7.2 or 7.3. Much faster. But it does little to speed up a raw
> count(*).
>
> > The kind of requests that I am really interested in are:
> > select count(*) from table where table.column like 'pattern%'
>
> > These seem to go much faster on mysql (which I guess is not an MVCC
> > database? or wasn't Innobase supposed to make it so?)
>
> They did incorporate a lot of MVCC logic into InnoDB tables, yes.
> Which means that if SELECT count(*) on an InnoDB table is just as
> fast as on a MyISAM table, then it is not accurate.

This is not necessarily true. The trigger-based approach to tracking the
current number of rows in a table might well be implemented internally, and
that may actually be much faster than doing it using triggers (the
performance losses you saw may well have been the result of PG's somewhat
poor trigger performance, and not the result of the approach itself. It
would be interesting to know how triggers affect the performance of other
databases).

--
Kevin Brown kevin@sysexperts.com
Kevin Brown <kevin@sysexperts.com> writes:
> Josh Berkus wrote:
>> They did incorporate a lot of MVCC logic into InnoDB tables, yes.
>> Which means that if SELECT count(*) on an InnoDB table is just as
>> fast as a MyISAM table, then it is not accurate.

> This is not necessarily true. The trigger-based approach to tracking
> the current number of rows in a table might well be implemented
> internally, and that may actually be much faster than doing it using
> triggers

You missed the point of Josh's comment: in an MVCC system, the correct
COUNT() varies depending on which transaction is asking. Therefore it is
not possible for a centrally maintained row counter to give accurate
results to everybody, no matter how cheap it is to maintain.

(The cheapness can be disputed as well, since it creates a single point of
contention for all inserts and deletes on the table. But that's a
different topic.)

			regards, tom lane
Kevin, Tom:

> (The cheapness can be disputed as well, since it creates a single point
> of contention for all inserts and deletes on the table. But that's a
> different topic.)

Actually, this was the problem with the trigger method of maintaining COUNT
information in PostgreSQL. The statistics table itself becomes a
significant source of delay, since if a table_A gets 10,000 rows updated
then table_count_A must necessarily be updated 10,000 times ... creating a
lot of dead tuples and severely bloating the table on disk until the next
vacuum ... resulting in Update #10,000 to table_count_A taking 100+ times
as long as Update #1 does, due to the required random seek time on disk.

I can personally think of two ways around this:

In MySQL: store table_count_A as a non-MVCC table or global variable.
Drawback: the count would not be accurate, as you would see changes due to
incomplete transactions, and eventually the count would be knocked off
completely by an overload of multi-user activity. However, this does fit
with MySQL's design philosophy of "Speed over accuracy", so I suspect that
that's what they're doing.

In PostgreSQL:
a) Put table_count_A on superfast media like a RAM card so that random
seeks after 10,000 updates do not become a significant delay;
b) create an asynchronous table aggregates collector which would collect
programmed statistics (like count(*) from table A) much in the same way
that the planner statistics collector does. This would have the
disadvantage of only being up to date when the database is idle, but the
advantage of not imposing any significant overhead on updates.

(Incidentally, I proposed this to one of my clients who complained about
Postgres' slow aggregate performance, but they declined to fund the effort)

--
Josh Berkus
Aglio Database Solutions
San Francisco
Josh Berkus <josh@agliodbs.com> writes:
> In PostgreSQL:
> a) Put table_count_A on superfast media like a RAM card so that random
> seeks after 10,000 updates do not become a significant delay;

As long as we're talking ugly, here ;-)

You could use a sequence to hold the aggregate counter. A sequence isn't
transactional and so does not accumulate dead tuples. "setval()" and
"select last_value" should have constant-time performance.

			regards, tom lane
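[Tom's sequence hack might be sketched like this; the names are
illustrative. Note the flip side of "isn't transactional": a transaction
that bumps the counter and then rolls back leaves its adjustment behind, so
the count drifts unless periodically re-seeded:]

```sql
-- Hold the row count in a sequence instead of a table (illustrative names).
-- No dead tuples accumulate, but rolled-back changes are not undone.
CREATE SEQUENCE addresses_count_seq MINVALUE 0;

-- Seed (or re-seed) the counter from the true count:
SELECT setval('addresses_count_seq', (SELECT count(*) FROM addresses));

-- Triggers would bump the counter on insert/delete, e.g.:
--   SELECT nextval('addresses_count_seq');                    -- +1 on INSERT
--   SELECT setval('addresses_count_seq', last_value - 1)
--     FROM addresses_count_seq;                               -- -1 on DELETE

-- Reading the cached count is constant-time:
SELECT last_value FROM addresses_count_seq;
```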
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > Josh Berkus wrote:
> >> They did incorporate a lot of MVCC logic into InnoDB tables, yes.
> >> Which means that if SELECT count(*) on an InnoDB table is just as
> >> fast as a MyISAM table, then it is not accurate.
>
> > This is not necessarily true. The trigger-based approach to tracking
> > the current number of rows in a table might well be implemented
> > internally, and that may actually be much faster than doing it using
> > triggers
>
> You missed the point of Josh's comment: in an MVCC system, the correct
> COUNT() varies depending on which transaction is asking. Therefore it
> is not possible for a centrally maintained row counter to give accurate
> results to everybody, no matter how cheap it is to maintain.

Hmm...true...but only if you really implement it as a faithful copy of the
trigger-based method. Implementing it on the backend brings some
advantages to the table, to wit:

* The individual transactions don't need to update the externally-visible
count on every insert or delete, they only need to update it at commit
time.

* The transaction can keep a count of the number of inserted and deleted
tuples it generates (on a per-table basis) during the life of the
transaction. The count value it returns to a client is the count value it
reads from the table that stores the count value plus any differences that
have been applied during the transaction. This is fast, because the
backend handling the transaction can keep this difference value in its own
private memory.

* When a transaction commits, it only needs to apply the "diff value" it
stores internally to the external count value. Contention on the count
value is only an issue if the external count value is currently being
written to by a transaction in the commit phase.

But the only time a transaction will be interested in reading that value is
when it's performing a count(*) operation or when it's committing
inserts/deletes that happened on the table in question (and then only if
the number of tuples inserted differs from the number deleted). So the
total amount of contention should be relatively low.

> (The cheapness can be disputed as well, since it creates a single point
> of contention for all inserts and deletes on the table. But that's a
> different topic.)

That's true, but the single point of contention is only an issue at
transaction commit time (unless you're implementing READ UNCOMMITTED), at
least if you do something like what I described above.

--
Kevin Brown kevin@sysexperts.com
Kevin Brown <kevin@sysexperts.com> writes:
> Tom Lane wrote:
>> You missed the point of Josh's comment: in an MVCC system, the correct
>> COUNT() varies depending on which transaction is asking. Therefore it
>> is not possible for a centrally maintained row counter to give accurate
>> results to everybody, no matter how cheap it is to maintain.

> Hmm...true...but only if you really implement it as a faithful copy of
> the trigger-based method.
> [ instead have transactions save up net deltas to apply at commit ]

Good try, but it doesn't solve the problem. SERIALIZABLE transactions
should not see deltas applied by any transaction that commits after they
start. READ COMMITTED transactions can see such deltas --- but not deltas
applied since the start of their current statement. (And there could be
several different "current statements" with different snapshots in progress
in a single READ COMMITTED transaction.)

AFAICS, central-counter techniques could only work in an MVCC system if
each transaction copies every counter in the system at each snapshot freeze
point, in case it finds itself needing that counter value later on. This
is a huge amount of mostly-useless overhead, and it makes the problem of
lock contention for access to the counters several orders of magnitude
worse than you'd first think.

Of course you can dodge lots of this overhead if you're willing to accept
approximate answers. But I don't believe that central counters are useful
in an exact-MVCC-semantics system.

			regards, tom lane
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > Tom Lane wrote:
> >> You missed the point of Josh's comment: in an MVCC system, the
> >> correct COUNT() varies depending on which transaction is asking.
> >> Therefore it is not possible for a centrally maintained row counter
> >> to give accurate results to everybody, no matter how cheap it is to
> >> maintain.
>
> > Hmm...true...but only if you really implement it as a faithful copy of
> > the trigger-based method.
> > [ instead have transactions save up net deltas to apply at commit ]
>
> Good try, but it doesn't solve the problem. SERIALIZABLE transactions
> should not see deltas applied by any transaction that commits after
> they start. READ COMMITTED transactions can see such deltas --- but not
> deltas applied since the start of their current statement. (And there
> could be several different "current statements" with different snapshots
> in progress in a single READ COMMITTED transaction.)

This is why I suspect the best way to manage this would be to manage the
counter itself using the MVCC mechanism (that is, you treat the shared
counter as a row in a table just like any other and, in fact, it might be
most beneficial for it to actually be exactly that), which handles the
visibility problem automatically. But I don't know how much contention
there would be as a result.

> Of course you can dodge lots of this overhead if you're willing to
> accept approximate answers. But I don't believe that central counters
> are useful in an exact-MVCC-semantics system.

No, but an MVCC-managed counter would be useful in such a system, wouldn't
it? Or am I missing something there, too (the deltas themselves would be
managed as described, and would be applied as described)?

So: how much contention would there be if the counter were managed in
exactly the same way as any row of a table is managed? Because I'm not
terribly familiar with how PG manages MVCC (pointers to documentation on it
welcomed) I can't answer that question myself.

--
Kevin Brown kevin@sysexperts.com
Kevin Brown wrote:
> No, but an MVCC-managed counter would be useful in such a system,
> wouldn't it? Or am I missing something there, too (the deltas
> themselves would be managed as described, and would be applied as
> described)?
>
> So: how much contention would there be if the counter were managed in
> exactly the same way as any row of a table is managed? Because I'm
> not terribly familiar with how PG manages MVCC (pointers to
> documentation on it welcomed) I can't answer that question myself.

It looks to me that a "row number -1" really would solve this problem.

I think a row counter on each table would be even useful for some kind of
auto-vacuum mechanism, that could be triggered if pg_class.reltuples
deviates too far from the real row count. Observing this mailing list,
missing or outdated statistics still seem to be a major source of
performance degradation. We all know these 1000 row estimates from
EXPLAIN, don't we? A default vacuum strategy for pgsql newbies should
solve a lot of those problems, preventing a lot of "pgsql is slow"
threads.

Regards,
Andreas
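[The deviation check Andreas has in mind could be expressed as a query like
the following -- a sketch only: the table name and the 2x threshold are
arbitrary, and of course the check itself pays for a count(*), which is
exactly what a maintained row counter would avoid:]

```sql
-- Compare the planner's stored estimate with the actual row count for one
-- table; a large deviation suggests ANALYZE/VACUUM is overdue.
SELECT c.relname,
       c.reltuples                              AS estimated,
       (SELECT count(*) FROM addresses)         AS actual
FROM pg_class c
WHERE c.relname = 'addresses'
  AND (c.reltuples < 0.5 * (SELECT count(*) FROM addresses) OR
       c.reltuples > 2.0 * (SELECT count(*) FROM addresses));
```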
Kevin Brown <kevin@sysexperts.com> writes:
> This is why I suspect the best way to manage this would be to manage
> the counter itself using the MVCC mechanism (that is, you treat the
> shared counter as a row in a table just like any other and, in fact,
> it might be most beneficial for it to actually be exactly that), which
> handles the visibility problem automatically. But I don't know how
> much contention there would be as a result.

Hm. Contention probably wouldn't be the killer, since if transactions
don't try to update the count until they are about to commit, they won't be
holding the row lock for long. (You'd have to beware of deadlocks between
transactions that need to update multiple counters, but that seems
soluble.) What *would* be a problem is that such counter tables would
accumulate huge numbers of dead rows very quickly, making it inefficient to
find the live row. Josh already mentioned this as a problem with
user-trigger-based counting. You could stanch the bleeding with
sufficiently frequent vacuums, perhaps, but it just doesn't look very
appealing.

Ultimately what this comes down to is "how much overhead are we willing to
load onto all other operations in order to make
SELECT-COUNT(*)-with-no-WHERE-clause fast"? Postgres has made a set of
design choices that favor the other operations. If you've designed an
application that lives or dies by fast COUNT(*), perhaps you should choose
another database.

			regards, tom lane
Andreas Pflug <Andreas.Pflug@web.de> writes:
> I think a row counter on each table would be even useful for some kind
> of auto-vacuum mechanism, that could be triggered if pg_class.reltuples
> deviates too far from the real row count.

It would be counting the wrong thing. auto-vacuum needs to know how many
dead tuples are in a table, not how many live ones. Example: UPDATE
doesn't change the live-tuple count (without this property, I don't think
the sort of count maintenance Kevin is proposing could possibly be
efficient enough to be interesting). But it does create a dead tuple that
vacuum wants to know about.

			regards, tom lane
Tom Lane wrote:
> It would be counting the wrong thing. auto-vacuum needs to know how
> many dead tuples are in a table, not how many live ones. Example:
> UPDATE doesn't change the live-tuple count (without this property,
> I don't think the sort of count maintenance Kevin is proposing could
> possibly be efficient enough to be interesting). But it does create
> a dead tuple that vacuum wants to know about.

I understand your point, but is this about VACUUM only, or VACUUM ANALYZE
too? People wouldn't bother about big databases if it's still fast (until
the disk is full :-)

Do dead tuples affect query planning? I thought the plan only cares about
existing rows and their data patterns. So count(*),
pg_stat_all_tables.n_tup_ins, .n_tup_upd and .n_tup_del all together can
make a VACUUM ANALYZE necessary, right?

Regards,
Andreas
On Sun, Apr 20, 2003 at 11:21:32AM -0400, Tom Lane wrote:
> favor the other operations. If you've designed an application that
> lives or dies by fast COUNT(*), perhaps you should choose another
> database.

Or consider redesigning the application. The "no where clause"
restriction sure smacks of poor database normalisation to me.

A

----
Andrew Sullivan                       204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>              M2P 2A8
                                      +1 416 646 3304 x110
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > This is why I suspect the best way to manage this would be to manage
> > the counter itself using the MVCC mechanism (that is, you treat the
> > shared counter as a row in a table just like any other and, in fact,
> > it might be most beneficial for it to actually be exactly that), which
> > handles the visibility problem automatically. But I don't know how
> > much contention there would be as a result.
>
> Hm. Contention probably wouldn't be the killer, since if transactions
> don't try to update the count until they are about to commit, they won't
> be holding the row lock for long. (You'd have to beware of deadlocks
> between transactions that need to update multiple counters, but that
> seems soluble.) What *would* be a problem is that such counter tables
> would accumulate huge numbers of dead rows very quickly, making it
> inefficient to find the live row.

But that inefficiency is a problem for *all* oft-updated tables, is it
not? I know that you'll end up with an additional n tuples per transaction
(where n is the average number of tables inserted into or deleted from per
transaction), so this isn't an insignificant problem, but it's one faced by
any application that often updates a small table. Causing a transaction
which is already doing inserts/deletes to take the hit of doing one
additional update doesn't seem to me to be a particularly large sacrifice,
especially since the table it's updating (the one that contains the counts)
is likely to be cached in its entirety. The chances are reasonable that
the other activity the transaction is performing will dwarf the additional
effort that maintaining the count demands.

> Josh already mentioned this as a problem with user-trigger-based
> counting.

Right, but the trigger-based mechanism probably magnifies the issue by
orders of magnitude, and thus can't necessarily be used as an argument
against an internally-implemented method.

> You could stanch the bleeding with sufficiently frequent vacuums,
> perhaps, but it just doesn't look very appealing.

I would say this is more a strong argument for automatic VACUUM management
than against count management, because what you say here is true of any
oft-updated, oft-referenced table.

> Ultimately what this comes down to is "how much overhead are we willing
> to load onto all other operations in order to make
> SELECT-COUNT(*)-with-no-WHERE-clause fast"? Postgres has made a set of
> design choices that favor the other operations. If you've designed an
> application that lives or dies by fast COUNT(*), perhaps you should
> choose another database.

Or perhaps a mechanism similar to the one being discussed should be
implemented and controlled with a GUC variable, so instead of forcing
someone to choose another database you force them to choose between the
performance tradeoffs involved. We already give DBAs such choices
elsewhere, e.g. pg_stat_activity.

The real question in all this is whether or not fast COUNT(*) operations
are needed often enough to even justify implementing a mechanism to make
them possible in PG. The question of getting fast answers from COUNT(*)
comes up often enough to be a FAQ, and that suggests that there's enough
demand for the feature that it may be worth implementing just to shut
those asking for it up. :-)

Personally, I'd rather see such development effort go towards more
beneficial improvements, such as replication, 2PC, SQL/MED, etc. (or even
improving the efficiency of MVCC, since it was mentioned here as a
problem! :-). I consider COUNT(*) without a WHERE clause to be a corner
case, despite the frequency of questions about it. But I don't think we
should reject a patch to implement fast COUNT(*) just because it
represents a performance tradeoff, at least if it's GUC-controlled.

--
Kevin Brown kevin@sysexperts.com
Kevin Brown <kevin@sysexperts.com> writes:
> Personally, I'd rather see such development effort go towards more
> beneficial improvements, such as replication, 2PC, SQL/MED, etc. (or
> even improving the efficiency of MVCC, since it was mentioned here as
> a problem! :-). I consider COUNT(*) without a WHERE clause to be a
> corner case, despite the frequency of questions about it.

Exactly.

> But I don't
> think we should reject a patch to implement fast COUNT(*) just because
> it represents a performance tradeoff, at least if it's GUC-controlled.

Well, this is moot since I see no one offering to provide such a patch.

But performance tradeoffs are only one of the costs involved. I suspect
any such mechanism would be invasive enough to represent a nontrivial
ongoing maintenance cost, whether anyone uses it or not. The extent to
which it munges core functionality would have to be a factor in deciding
whether to accept it. It'd take lots more thought than we've expended in
this thread to get an accurate handle on just what would be involved...

(BTW, if anyone actually is thinking about this, please make it a
per-table option, not a global GUC option.)

			regards, tom lane
Kevin,

> > Josh already mentioned this as a problem with user-trigger-based
> > counting.
>
> Right, but the trigger-based mechanism probably magnifies the issue by
> orders of magnitude, and thus can't necessarily be used as an argument
> against an internally-implemented method.

I'm not sure about that, Kevin. The production trigger test was written in
C (by Joe Conway), using some of the best memory/efficiency management he
could devise. I could buy that the trigger mechanism adds a certain fixed
overhead to the process, but not the contention that we were seeing ...
especially not the geometric progression of inefficiency as the
transaction count went up. We'll talk about this offlist; I may be able to
get the client to authorize letting you examine the database.

For further detail, our setup was sort of a "destruction test", including:
1) a slightly underpowered server running too many processes;
2) a very high disk contention environment, with multiple applications
fighting for I/O;
3) running COUNT(*), GROUP BY x on a table with 1.4 million rows, which
was being updated in batches of 10,000 rows to 40,000 rows every few
minutes.

As I said before, the overhead for C-trigger-based accounting, within the
MVCC framework, was quite tolerable with small update batches: only a
9-11% penalty to the updates overall for batches of 100-300 updates.
However, as we increased the application activity, the update penalty
increased, up to 40-45% with the full production load.

It's not hard to figure out why; like most users' servers, the aggregate
caching table was on the same disk as the table(s) being updated. The
result was a huge amount of disk-head-skipping between the updated table
and the aggregate caching table every time a commit hit the database, with
random seek times increasing the longer the time since the last VACUUM.

Now, on a better server with these tables on fast RAID or on different
spindles, I expect the result would be somewhat better. However, I also
suspect that many of the users who complain the loudest about slow
count(*) are operating in single-spindle environments.

--
Josh Berkus
Aglio Database Solutions
San Francisco
On Sat, Apr 19, 2003 at 12:03:18PM -0700, Josh Berkus wrote:
> Kevin, Tom:
>
> > (The cheapness can be disputed as well, since it creates a single
> > point of contention for all inserts and deletes on the table. But
> > that's a different topic.)
>
> Actually, this was the problem with the trigger method of maintaining
> COUNT information in PostgreSQL. The statistics table itself becomes a
> significant source of delay, since if a table_A gets 10,000 rows updated
> then table_count_A must necessarily be updated 10,000 times ... creating
> a lot of dead tuples and severely bloating the table on disk until the
> next vacuum ... resulting in Update #10,000 to table_count_A taking 100+
> times as long as Update #1 does, due to the required random seek time on
> disk.

Once statement-level triggers are implemented, the performance would
probably be fine, assuming your update was a single statement.

--
Jim C. Nasby (aka Decibel!)                     jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
> Once statement-level triggers are implemented, the performance would
> probably be fine, assuming your update was a single statement.

Statement triggers _are_ implemented in CVS.

Chris
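[A statement-level trigger fires once per statement rather than once per
affected row, so a 10,000-row UPDATE costs one trigger call instead of
10,000. A sketch with illustrative names; note the caveat in the comments
-- statement triggers at this point expose no per-row OLD/NEW data, which
limits what a count-maintenance trigger can actually learn:]

```sql
-- Statement-level trigger syntax as in 7.4 CVS: the function runs once
-- per statement, not once per affected row. Names are illustrative.
CREATE FUNCTION note_addresses_change() RETURNS trigger AS '
BEGIN
    -- Caveat: a statement trigger sees no OLD/NEW rows, so a count cache
    -- would need some other way to learn how many rows were affected.
    RAISE NOTICE ''addresses modified by a statement'';
    RETURN NULL;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER addresses_stmt_trig
    AFTER INSERT OR UPDATE OR DELETE ON addresses
    FOR EACH STATEMENT EXECUTE PROCEDURE note_addresses_change();
```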
Tom Lane <tgl@sss.pgh.pa.us> writes:
> AFAICS, central-counter techniques could only work in an MVCC system
> if each transaction copies every counter in the system at each snapshot
> freeze point, in case it finds itself needing that counter value later
> on. This is a huge amount of mostly-useless overhead, and it makes the
> problem of lock contention for access to the counters several orders of
> magnitude worse than you'd first think.

Well, one option would be to do it in a lazy way. If you do an update on a
table with cached aggregate data, just throw the data out. This way you
get to cache data on infrequently updated tables and get only a very small
penalty on frequently updated tables.

--
greg