Thread: Most efficient report of number of records in all tables?

Most efficient report of number of records in all tables?

From
"D. Dante Lorenso"
Date:
All,

I can find the names of all tables in the database with this query:

    SELECT table_name
    FROM information_schema.tables
    WHERE table_type = 'BASE TABLE'
    AND table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY table_name ASC;


Then, in code, I can loop through all the table names and run the
following query:

    SELECT COUNT(*) AS result
    FROM $table;


But, this can be slow when I have a large number of tables of some
tables have several million rows.

Is there a faster way to get this data using table statistics or
something like that?  Perhaps something in a single query?

-- Dante




preventing ALTER TABLE RENAME from changing view definitions?

From
"George Pavlov"
Date:
Currently "ALTER TABLE ... RENAME TO ..." results in all views that
refer to the table to be rewritten with the new table name. This is a
good thing in the general case, but there are also situations where it
is not (e.g. temporarily renaming tables for data reorg reasons). I
can't seem to find a clean way to only rename the table without causing
change to the view. The ONLY keyword does not work in this case.
Anything I am missing (short of re-creating all views). I am on 8.1.

test=> create table a (col int);
CREATE TABLE
test=> create view v_a as select col from a;
CREATE VIEW
test=> \d v_a
      View "public.v_a"
 Column |  Type   | Modifiers
--------+---------+-----------
 col    | integer |
View definition:
 SELECT a.col
   FROM a;

test=> alter table a rename to b;
ALTER TABLE
test=> \d v_a
      View "public.v_a"
 Column |  Type   | Modifiers
--------+---------+-----------
 col    | integer |
View definition:
 SELECT a.col
   FROM b a;

Re: Most efficient report of number of records in all tables?

From
"Dann Corbit"
Date:
If you only need a cardinality estimate, then pg_class.reltuples may be
of help (it will be accurate to when the last vacuum was performed).

If you need exact counts then there are a couple of problems:
1.  An MVCC database cannot store an exact count, because it can differ
by user.  Hence, to collect the exact number, a table scan is necessary.
2.  The number can be invalid immediately after the query and might be
different for different users anyway.

What are you doing with those numbers?

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of D. Dante Lorenso
> Sent: Monday, February 26, 2007 2:20 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Most efficient report of number of records in all
> tables?
>
> All,
>
> I can find the names of all tables in the database with this query:
>
>     SELECT table_name
>     FROM information_schema.tables
>     WHERE table_type = 'BASE TABLE'
>     AND table_schema NOT IN ('pg_catalog', 'information_schema')
>     ORDER BY table_name ASC;
>
>
> Then, in code, I can loop through all the table names and run the
> following query:
>
>     SELECT COUNT(*) AS result
>     FROM $table;
>
>
> But, this can be slow when I have a large number of tables of some
> tables have several million rows.
>
> Is there a faster way to get this data using table statistics or
> something like that?  Perhaps something in a single query?
>
> -- Dante
>
>
>
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 6: explain analyze is your friend

Re: Most efficient report of number of records in all tables?

From
"D. Dante Lorenso"
Date:
Dann Corbit wrote:
> If you only need a cardinality estimate, then pg_class.reltuples may be
> of help (it will be accurate to when the last vacuum was performed).
>

Last vacuum ... how does that work with autovacuum?

> If you need exact counts then there are a couple of problems:
> 1.  An MVCC database cannot store an exact count, because it can differ
> by user.  Hence, to collect the exact number, a table scan is necessary.
>
A table scan ... ouch?  I just assumed that COUNT(*) FROM table_name
would be a fast query internally.  I see what you mean about MVCC, though.

> 2.  The number can be invalid immediately after the query and might be
> different for different users anyway.
>
The numbers don't really need to be 100% accurate (it's just a ballpark
stat).

> What are you doing with those numbers?
>

It's just an administrative report showing patterns of growth in our
database storage.  We are trying to keep statistics for users and our
stats tables are generating about 50,000 records daily.  We only know
this is true because we have this reports which shows table record
counts daily.

-- Dante

>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
>> owner@postgresql.org] On Behalf Of D. Dante Lorenso
>> Sent: Monday, February 26, 2007 2:20 PM
>> To: pgsql-general@postgresql.org
>> Subject: [GENERAL] Most efficient report of number of records in all
>> tables?
>>
>> All,
>>
>> I can find the names of all tables in the database with this query:
>>
>>     SELECT table_name
>>     FROM information_schema.tables
>>     WHERE table_type = 'BASE TABLE'
>>     AND table_schema NOT IN ('pg_catalog', 'information_schema')
>>     ORDER BY table_name ASC;
>>
>>
>> Then, in code, I can loop through all the table names and run the
>> following query:
>>
>>     SELECT COUNT(*) AS result
>>     FROM $table;
>>
>>
>> But, this can be slow when I have a large number of tables of some
>> tables have several million rows.
>>
>> Is there a faster way to get this data using table statistics or
>> something like that?  Perhaps something in a single query?
>>
>> -- Dante
>>
>>
>>
>>
>> ---------------------------(end of
>>
> broadcast)---------------------------
>
>> TIP 6: explain analyze is your friend
>>
>
>



Re: preventing ALTER TABLE RENAME from changing view definitions?

From
Richard Huxton
Date:
George Pavlov wrote:
> Currently "ALTER TABLE ... RENAME TO ..." results in all views that
> refer to the table to be rewritten with the new table name. This is a
> good thing in the general case, but there are also situations where it
> is not (e.g. temporarily renaming tables for data reorg reasons). I
> can't seem to find a clean way to only rename the table without causing
> change to the view. The ONLY keyword does not work in this case.
> Anything I am missing (short of re-creating all views). I am on 8.1.

It'll either rename in all views, or not happen at all. Otherwise,
there'd be nothing to stop you replacing the table with one that has an
incompatible definition.

Is this something you do regularly, and if so what are you trying to
achieve?

--
   Richard Huxton
   Archonet Ltd

Re: Most efficient report of number of records in all tables?

From
Alvaro Herrera
Date:
D. Dante Lorenso wrote:
> Dann Corbit wrote:
> >If you only need a cardinality estimate, then pg_class.reltuples may be
> >of help (it will be accurate to when the last vacuum was performed).
>
> Last vacuum ... how does that work with autovacuum?

The same, only that you'd have to monitor autovac activity to know for
sure how far back it is :-)

However, note that the optimizer uses the following trick to guesstimate
the number of tuples on any given table: first, it asks the kernel for
the number of blocks in the files corresponding to the table.  Then, it
uses the pg_class.reltuples and relpages values to estimate a "tuple
density" (tuples per page), then multiplies by the number of blocks.  As
far as estimates go, this one is pretty good.

The only thing I'm not sure how to get from SQL, is the actual number of
blocks.  You could trivially build a C function to get it.  In fact,
contrib/pgstatindex in CVS head (not sure if it's in 8.2) contains such
a function which you could borrow, pg_relpages().

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Most efficient report of number of records in all tables?

From
"Merlin Moncure"
Date:
On 2/27/07, D. Dante Lorenso <dante@lorenso.com> wrote:
> Dann Corbit wrote:
> > If you only need a cardinality estimate, then pg_class.reltuples may be
> > of help (it will be accurate to when the last vacuum was performed).
> >
>
> Last vacuum ... how does that work with autovacuum?

'analyze' updates pg_class.reltuples also.  It is also cheaper than
vacuum...you can force a fresh one by doing an analyze before you do
your sweep.

merlin

Re: preventing ALTER TABLE RENAME from changing view definitions?

From
Tom Lane
Date:
"George Pavlov" <gpavlov@mynewplace.com> writes:
> Currently "ALTER TABLE ... RENAME TO ..." results in all views that
> refer to the table to be rewritten with the new table name.

They are not "rewritten".  Views refer to tables by OID, and are
therefore entirely insensitive to RENAME operations.  This is not
something we're likely to change.

            regards, tom lane