Thread: PL/PgSQL for counting all rows in all tables.

PL/PgSQL for counting all rows in all tables.

From
David Fetter
Date:
Folks,

I've noticed that when coming into an organization, I need to do some
counting on what's in all the tables in a db.  This shortens that
process considerably, with the usual caveat that count(*) is a heavy
operation.

By the way, the 3 lines following "godawful hack" point to something
PL/PgSQL ought (imho) to be able to do, namely something like

EXECUTE INTO [ record | rowtype | type ] [sql TEXT string returning a single row];

Here 'tis: version 0.01...

CREATE TYPE table_count AS (table_name TEXT, num_rows INTEGER);

CREATE OR REPLACE FUNCTION count_em_all () RETURNS SETOF table_count
AS '
DECLARE   the_count RECORD;   t_name RECORD;   r table_count%ROWTYPE;
BEGIN   FOR t_name IN       SELECT c.relname       FROM           pg_catalog.pg_class c               LEFT JOIN
 pg_namespace n               ON           n.oid = c.relnamespace       WHERE           c.relkind = ''r''
AND          n.nspname = ''public''       ORDER BY 1   LOOP       -- The next 3 lines are a godawful hack. :P       FOR
the_countIN EXECUTE ''SELECT COUNT(*) AS "count" FROM '' || t_name.relname       LOOP       END LOOP;
r.table_name:= t_name.relname;       r.num_rows := the_count.count;       RETURN NEXT r;   END LOOP;   RETURN;
 
END;
' LANGUAGE plpgsql;

COMMENT ON FUNCTION count_em_all () IS 'Spits out all tables in the public schema and the exact row counts for each.';

-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!


Re: PL/PgSQL for counting all rows in all tables.

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
My version:
ANALYZE;
SELECT n.nspname, relname, reltuples
FROM pg_class c, pg_namespace n
WHERE c.relnamespace=n.oid
AND relkind='r'
AND NOT n.nspname ~ '^pg_'
ORDER BY 1,2;
- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200409282138
-----BEGIN PGP SIGNATURE-----
iD8DBQFBWhIivJuQZxSWSsgRAm3cAJwLySzR5mpOHDT06LN6vj9M5zyyOQCgt4j1
gWDPcXJoUZbTlZvxxrE7fcY=
=MOMs
-----END PGP SIGNATURE-----




Re: PL/PgSQL for counting all rows in all tables.

From
Ian FREISLICH
Date:
"Greg Sabino Mullane" wrote:
> ANALYZE;
>  
> SELECT n.nspname, relname, reltuples
> FROM pg_class c, pg_namespace n
> WHERE c.relnamespace=n.oid
> AND relkind='r'
> AND NOT n.nspname ~ '^pg_'
> ORDER BY 1,2;

Maybe this gem should be passed onto the pgadmin folks.  When you
click on a table name in the interface it does what I can only
presume is a count(*) from relation, which takes forever on enormous
relations.  It then claims this to be a row estimate anyway, so
they could probably drop the analyze.

Ian

--
Ian Freislich


Re: PL/PgSQL for counting all rows in all tables.

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Ian FREISLICH
> Sent: 05 October 2004 09:57
> To: Greg Sabino Mullane
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] PL/PgSQL for counting all rows in all tables.
>
> "Greg Sabino Mullane" wrote:
> > ANALYZE;
> >
> > SELECT n.nspname, relname, reltuples
> > FROM pg_class c, pg_namespace n
> > WHERE c.relnamespace=n.oid
> > AND relkind='r'
> > AND NOT n.nspname ~ '^pg_'
> > ORDER BY 1,2;
>
> Maybe this gem should be passed onto the pgadmin folks.  When
> you click on a table name in the interface it does what I can
> only presume is a count(*) from relation, which takes forever
> on enormous relations.  It then claims this to be a row
> estimate anyway, so they could probably drop the analyze.

The 'Rows (counted)' value is taken from a count(*), but only if the
'Rows (estimated)' value (which comes from pg_class.reltuples, as above,
but without the costly analyze) is less than the cut-off value set in
the options dialogue. So, if you never want to wait for the exact row
count, just set the appropriate option to zero.

Regards Dave



Re: PL/PgSQL for counting all rows in all tables.

From
Robert Treat
Date:
On Tue, 2004-10-05 at 05:09, Dave Page wrote:
>  
> 
> > -----Original Message-----
> > From: pgsql-hackers-owner@postgresql.org 
> > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Ian FREISLICH
> > Sent: 05 October 2004 09:57
> > To: Greg Sabino Mullane
> > Cc: pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] PL/PgSQL for counting all rows in all tables. 
> > 
> > "Greg Sabino Mullane" wrote:
> > > ANALYZE;
> > >  
> > > SELECT n.nspname, relname, reltuples
> > > FROM pg_class c, pg_namespace n
> > > WHERE c.relnamespace=n.oid
> > > AND relkind='r'
> > > AND NOT n.nspname ~ '^pg_'
> > > ORDER BY 1,2;
> > 
> > Maybe this gem should be passed onto the pgadmin folks.  When 
> > you click on a table name in the interface it does what I can 
> > only presume is a count(*) from relation, which takes forever 
> > on enormous relations.  It then claims this to be a row 
> > estimate anyway, so they could probably drop the analyze.
> 
> The 'Rows (counted)' value is taken from a count(*), but only if the
> 'Rows (estimated)' value (which comes from pg_class.reltuples, as above,
> but without the costly analyze) is less than the cut-off value set in
> the options dialogue. So, if you never want to wait for the exact row
> count, just set the appropriate option to zero.
> 

How do you handle table growth that makes the reltuples value out of
whack since the last analyze?  Seems unfortunate that people may not
realize that the numbers they are looking at are incorrect but I don't
see much way to avoid it. 

Seems new tables would have that problem too since they would default to
1000... do you analyze after table creation?


Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: PL/PgSQL for counting all rows in all tables.

From
"Dave Page"
Date:

> -----Original Message-----
> From: Robert Treat [mailto:xzilla@users.sourceforge.net]
> Sent: 11 October 2004 22:30
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] PL/PgSQL for counting all rows in all tables.
>
> How do you handle table growth that makes the reltuples value
> out of whack since the last analyze?  Seems unfortunate that
> people may not realize that the numbers they are looking at
> are incorrect but I don't see much way to avoid it.

Right-click the table object and select 'Count' on the current versions.
Previously, iirc it showed the message 'Refresh table to count' in the
actual count field, so you did a right-click -> Refresh.

> Seems new tables would have that problem too since they would
> default to 1000... do you analyze after table creation?

No. 1000 is below the default threshold, so it'll get counted. If you
lower the threshold, it will say 'not counted' in the actual count
field.

Regards, Dave.


Re: PL/PgSQL for counting all rows in all tables.

From
Robert Treat
Date:
On Tuesday 12 October 2004 03:22, Dave Page wrote:
> > -----Original Message-----
> > From: Robert Treat [mailto:xzilla@users.sourceforge.net]
> > Sent: 11 October 2004 22:30
> > To: Dave Page
> > Cc: pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] PL/PgSQL for counting all rows in all tables.
> >
> > How do you handle table growth that makes the reltuples value
> > out of whack since the last analyze?  Seems unfortunate that
> > people may not realize that the numbers they are looking at
> > are incorrect but I don't see much way to avoid it.
>
> Right-click the table object and select 'Count' on the current versions.
> Previously, iirc it showed the message 'Refresh table to count' in the
> actual count field, so you did a right-click -> Refresh.
>

Maybe I didn't phrase that quite right. How would a user know that he needs to 
do a real count?  For example, if I have a table with est 1 million rows, and 
I load another 1 million rows into it, wont pgadmin show me 1 million rows 
until I run an analyze? Even if I run a manual count, wont it show 1 million 
next time I come into the application, and that time I may not realize that 
the table is off by 1 million rows so I take the estimated count at face 
value.  

BTW The reason I'm asking about this is we're trying to come up with a good 
scheme for phppgadmin to show estimated counts without showing incorrect 
numbers to users... or at least giving them a clue that the numbers might be 
really off. 

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: PL/PgSQL for counting all rows in all tables.

From
Andreas Pflug
Date:
Robert Treat wrote:

>>
>>Right-click the table object and select 'Count' on the current versions.
>>Previously, iirc it showed the message 'Refresh table to count' in the
>>actual count field, so you did a right-click -> Refresh.
>>
> 
> 
> Maybe I didn't phrase that quite right. How would a user know that he needs to 
> do a real count?  For example, if I have a table with est 1 million rows, and 
> I load another 1 million rows into it, wont pgadmin show me 1 million rows 
> until I run an analyze? Even if I run a manual count, wont it show 1 million 
> next time I come into the application, and that time I may not realize that 
> the table is off by 1 million rows so I take the estimated count at face 
> value.  

pgAdmin will either show the true row count from a SELECT COUNT(*), or 
"not counted". The last rowcount from vacuum is named "estimated row 
count", so it's clearly distinguishable which number is counted and 
which is estimated.

Regards,
Andreas




Re: PL/PgSQL for counting all rows in all tables.

From
Rod Taylor
Date:
> BTW The reason I'm asking about this is we're trying to come up with a good 
> scheme for phppgadmin to show estimated counts without showing incorrect 
> numbers to users... or at least giving them a clue that the numbers might be 
> really off. 

The vacuum daemon should know how far off the count might be. You can
display 100 +- 15%, where the 100 is collected by ANALYZE and the 15% is
determined based on the vacuum daemon estimation.

Of course, AVD needs to be implemented and the interface created (a
pg_count view?)...



Re: PL/PgSQL for counting all rows in all tables.

From
"Dave Page"
Date:

> -----Original Message-----
> From: Robert Treat [mailto:xzilla@users.sourceforge.net]
> Sent: 12 October 2004 21:21
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] PL/PgSQL for counting all rows in all tables.
>
> Maybe I didn't phrase that quite right. How would a user know
> that he needs to do a real count?  For example, if I have a
> table with est 1 million rows, and I load another 1 million
> rows into it, wont pgadmin show me 1 million rows until I run
> an analyze? Even if I run a manual count, wont it show 1
> million next time I come into the application, and that time
> I may not realize that the table is off by 1 million rows so
> I take the estimated count at face value.

Yeah, that's exactly what can happen. If it makes you feel any better
about doing it in phppgadmin, then we never got any complaints about it!

We simply rely on the fact that it's labelled as an estimate. I can't
see that there's any other way around it that doesn't require running a
potentially very expensive ANALYZE to update the stats.

Regards, Dave