Thread: Most efficient report of number of records in all tables?
All, I can find the names of all tables in the database with this query: SELECT table_name FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema NOT IN ('pg_catalog', 'information_schema') ORDER BY table_name ASC; Then, in code, I can loop through all the table names and run the following query: SELECT COUNT(*) AS result FROM $table; But, this can be slow when I have a large number of tables of some tables have several million rows. Is there a faster way to get this data using table statistics or something like that? Perhaps something in a single query? -- Dante
Currently "ALTER TABLE ... RENAME TO ..." results in all views that refer to the table to be rewritten with the new table name. This is a good thing in the general case, but there are also situations where it is not (e.g. temporarily renaming tables for data reorg reasons). I can't seem to find a clean way to only rename the table without causing change to the view. The ONLY keyword does not work in this case. Anything I am missing (short of re-creating all views). I am on 8.1. test=> create table a (col int); CREATE TABLE test=> create view v_a as select col from a; CREATE VIEW test=> \d v_a View "public.v_a" Column | Type | Modifiers --------+---------+----------- col | integer | View definition: SELECT a.col FROM a; test=> alter table a rename to b; ALTER TABLE test=> \d v_a View "public.v_a" Column | Type | Modifiers --------+---------+----------- col | integer | View definition: SELECT a.col FROM b a;
If you only need a cardinality estimate, then pg_class.reltuples may be of help (it will be accurate to when the last vacuum was performed). If you need exact counts then there are a couple of problems: 1. An MVCC database cannot store an exact count, because it can differ by user. Hence, to collect the exact number, a table scan is necessary. 2. The number can be invalid immediately after the query and might be different for different users anyway. What are you doing with those numbers? > -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of D. Dante Lorenso > Sent: Monday, February 26, 2007 2:20 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Most efficient report of number of records in all > tables? > > All, > > I can find the names of all tables in the database with this query: > > SELECT table_name > FROM information_schema.tables > WHERE table_type = 'BASE TABLE' > AND table_schema NOT IN ('pg_catalog', 'information_schema') > ORDER BY table_name ASC; > > > Then, in code, I can loop through all the table names and run the > following query: > > SELECT COUNT(*) AS result > FROM $table; > > > But, this can be slow when I have a large number of tables of some > tables have several million rows. > > Is there a faster way to get this data using table statistics or > something like that? Perhaps something in a single query? > > -- Dante > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend
Dann Corbit wrote: > If you only need a cardinality estimate, then pg_class.reltuples may be > of help (it will be accurate to when the last vacuum was performed). > Last vacuum ... how does that work with autovacuum? > If you need exact counts then there are a couple of problems: > 1. An MVCC database cannot store an exact count, because it can differ > by user. Hence, to collect the exact number, a table scan is necessary. > A table scan ... ouch? I just assumed that COUNT(*) FROM table_name would be a fast query internally. I see what you mean about MVCC, though. > 2. The number can be invalid immediately after the query and might be > different for different users anyway. > The numbers don't really need to be 100% accurate (it's just a ballpark stat). > What are you doing with those numbers? > It's just an administrative report showing patterns of growth in our database storage. We are trying to keep statistics for users and our stats tables are generating about 50,000 records daily. We only know this is true because we have this reports which shows table record counts daily. -- Dante > >> -----Original Message----- >> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- >> owner@postgresql.org] On Behalf Of D. Dante Lorenso >> Sent: Monday, February 26, 2007 2:20 PM >> To: pgsql-general@postgresql.org >> Subject: [GENERAL] Most efficient report of number of records in all >> tables? >> >> All, >> >> I can find the names of all tables in the database with this query: >> >> SELECT table_name >> FROM information_schema.tables >> WHERE table_type = 'BASE TABLE' >> AND table_schema NOT IN ('pg_catalog', 'information_schema') >> ORDER BY table_name ASC; >> >> >> Then, in code, I can loop through all the table names and run the >> following query: >> >> SELECT COUNT(*) AS result >> FROM $table; >> >> >> But, this can be slow when I have a large number of tables of some >> tables have several million rows. >> >> Is there a faster way to get this data using table statistics or >> something like that? Perhaps something in a single query? >> >> -- Dante >> >> >> >> >> ---------------------------(end of >> > broadcast)--------------------------- > >> TIP 6: explain analyze is your friend >> > >
George Pavlov wrote: > Currently "ALTER TABLE ... RENAME TO ..." results in all views that > refer to the table to be rewritten with the new table name. This is a > good thing in the general case, but there are also situations where it > is not (e.g. temporarily renaming tables for data reorg reasons). I > can't seem to find a clean way to only rename the table without causing > change to the view. The ONLY keyword does not work in this case. > Anything I am missing (short of re-creating all views). I am on 8.1. It'll either rename in all views, or not happen at all. Otherwise, there'd be nothing to stop you replacing the table with one that has an incompatible definition. Is this something you do regularly, and if so what are you trying to achieve? -- Richard Huxton Archonet Ltd
D. Dante Lorenso wrote: > Dann Corbit wrote: > >If you only need a cardinality estimate, then pg_class.reltuples may be > >of help (it will be accurate to when the last vacuum was performed). > > Last vacuum ... how does that work with autovacuum? The same, only that you'd have to monitor autovac activity to know for sure how far back it is :-) However, note that the optimizer uses the following trick to guesstimate the number of tuples on any given table: first, it asks the kernel for the number of blocks in the files corresponding to the table. Then, it uses the pg_class.reltuples and relpages values to estimate a "tuple density" (tuples per page), then multiplies by the number of blocks. As far as estimates go, this one is pretty good. The only thing I'm not sure how to get from SQL, is the actual number of blocks. You could trivially build a C function to get it. In fact, contrib/pgstatindex in CVS head (not sure if it's in 8.2) contains such a function which you could borrow, pg_relpages(). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On 2/27/07, D. Dante Lorenso <dante@lorenso.com> wrote: > Dann Corbit wrote: > > If you only need a cardinality estimate, then pg_class.reltuples may be > > of help (it will be accurate to when the last vacuum was performed). > > > > Last vacuum ... how does that work with autovacuum? 'analyze' updates pg_class.reltuples also. It is also cheaper than vacuum...you can force a fresh one by doing an analyze before you do your sweep. merlin
"George Pavlov" <gpavlov@mynewplace.com> writes: > Currently "ALTER TABLE ... RENAME TO ..." results in all views that > refer to the table to be rewritten with the new table name. They are not "rewritten". Views refer to tables by OID, and are therefore entirely insensitive to RENAME operations. This is not something we're likely to change. regards, tom lane