Thread: "select count(*) from contacts" is too slow!
Why does 'select count(id) from "tblContacts"' do a sequential scan when the field 'id' is indexed using a btree?
MySql simply looks at the index which is keeping a handy record of the number of rows.
Can anybody explain how and why postgres does this query like it does?
Many thanks
Paul
MySql simply looks at the index which is keeping a handy record of the number of rows.
Can anybody explain how and why postgres does this query like it does?
Many thanks
Paul
On Tue, 7 Oct 2003, Paul Serby wrote: > Why does '*select count(id) from "tblContacts"'* do a sequential scan > when the field '*id*' is indexed using a btree? > > MySql simply looks at the index which is keeping a handy record of the > number of rows. > > Can anybody explain how and why postgres does this query like it does? It's a FAQ I believe. MySQL can tell you from it's index because it doesn't care if it gives you the right number or not. -- Nigel Andrews
On Tue, 7 Oct 2003, Paul Serby wrote: > Why does '*select count(id) from "tblContacts"'* do a sequential scan > when the field '*id*' is indexed using a btree? > > MySql simply looks at the index which is keeping a handy record of the > number of rows. > > Can anybody explain how and why postgres does this query like it does? Because the index doesn't contain enough information to determine if a particular row is visible to your transaction or not. It would have to go read the table to find that out, at which point using the index doesn't help. There's been a recent discussion of this on one of the lists (either -general or -performance I'd guess) that you might want to look up in the archives.
> MySQL can tell you from it's index because it doesn't care if it gives you the > right number or not. Under what circumstances would MySQL give the wrong number? -- Mike Nolan
A long time ago, in a galaxy far, far away, paul.serby@clockltd.com (Paul Serby) wrote: > Why does 'select count(id) from "tblContacts"' do a sequential scan > when the field 'id' is indexed using a btree? MySql simply looks at > the index which is keeping a handy record of the number of rows. > Can anybody explain how and why postgres does this query like it > does? Look into the semantics of MVCC (MultiVersion Concurrency Control); that (otherwise useful) feature prevents having any such "handy record." -- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://www.ntlug.org/~cbbrowne/spreadsheets.html Developmental Psychology "Schoolyard behavior resembles adult primate behavior because "Ontogeny Recapitulates Phylogeny" doesn't stop at birth." -- Mark Miller
A long time ago, in a galaxy far, far away, nolan@celery.tssi.com wrote: >> MySQL can tell you from it's index because it doesn't care if it gives you the >> right number or not. > > Under what circumstances would MySQL give the wrong number? It would give the wrong number under _every_ circumstance where there are uncommitted INSERTs or DELETEs. -- select 'cbbrowne' || '@' || 'ntlug.org'; http://www3.sympatico.ca/cbbrowne/sap.html Appendium to the Rules of the Evil Overlord #1: "I will not build excessively integrated security-and-HVAC systems. They may be Really Cool, but are far too vulnerable to breakdowns."
Christopher Browne wrote: > A long time ago, in a galaxy far, far away, nolan@celery.tssi.com wrote: > >>>MySQL can tell you from it's index because it doesn't care if it gives you the >>>right number or not. >> >>Under what circumstances would MySQL give the wrong number? > > > It would give the wrong number under _every_ circumstance where there > are uncommitted INSERTs or DELETEs. Give them some credit. I just double checked: Using mysql 4.0.14 + innodb and transactions, select count(*) from foo; does not count uncommited INSERTs. Heck, even using myisam, mysql's count(*)'s still accurate, since all INSERTs, etc are autocommitted. -- Linux homer 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux 12:00pm up 287 days, 3:33, 7 users, load average: 6.93, 6.31, 6.16
Attachment
Ang Chin Han <angch@bytecraft.com.my> writes: > Heck, even using myisam, mysql's count(*)'s still accurate, since all INSERTs, > etc are autocommitted. That's sort of true, but not the whole story. Even autocommitted transactions can be pending for a significant amount of time. The reason it's accurate is because with mysql isam tables all updates take a table level lock. So there's never a chance to select the count while an uncommitted transaction is pending, even if the update takes a long time. This is simple and efficient when you have low levels of concurrency. But when you have 4+ processors or transactions involving lots of disk i/o it kills scalability. I'm curious how it's implemented with innodb tables. Do they still take a table-level lock when committing to update the counters? What happens to transactions that have already started, do they see the new value? Actually it occurs to me that that might be ok for read-committed. Is there ever a situation where a count(*) needs to represent an old snapshot in read-committed? It has to for long-running selects, but if the count(*) itself is always fast that need should never arise, just shared-lock and read the value and unlock. In order words, imagine if you had every transaction keep a net delta of rows for every table and at commit time locked the entire table and updated the count. The lock would be a point of contention but it would be very fast since it would only have to update an integer with a precalculated adjustment. In read-committed mode that would always be a valid value. (The transaction would have to apply its own deltas I guess.) -- greg