Thread: all empty tables

all empty tables

From
Garry Saddington
Date:
How could I list all the tables in a database that do not contain any data?
I have looked at reltuples but can't quite work out how to use it, any
pointers would be much apreciated.
Regards
Garry

Re: all empty tables

From
Peter Eisentraut
Date:
On lör, 2009-11-14 at 10:12 +0000, Garry Saddington wrote:
> How could I list all the tables in a database that do not contain any data?
> I have looked at reltuples but can't quite work out how to use it, any
> pointers would be much apreciated.

select * from pg_class where relkind = 'r' and reltuples = 0; ?

Of course reltuples is only a statistic, not an up to date value, but
the above should be a pretty helpful start to detect possibly useless
tables.


Re: all empty tables

From
Andreas Kretschmer
Date:
Garry Saddington <garry@schoolteachers.co.uk> wrote:

> How could I list all the tables in a database that do not contain any data?
> I have looked at reltuples but can't quite work out how to use it, any
> pointers would be much apreciated.

reltuples contains only an estimation, if you need the real values you
need to look at every table.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: all empty tables

From
Craig Ringer
Date:
On 14/11/2009 6:12 PM, Garry Saddington wrote:
> How could I list all the tables in a database that do not contain any data?
> I have looked at reltuples but can't quite work out how to use it, any
> pointers would be much apreciated.

Define "empty". In a MVCC database, it's harder than you'd think.

A table of zero on-disk size is definitely empty. So is a table where no
tuples are visible to any currently running or future transactions.
Anything else is arguable. What if the tuples have all been DELETEd by a
transaction that's committed, but one or more statements (or
SERIALIZABLE transactions) are running that are working with the state
of the database as it was before the DELETE committed? Is the table then
empty?



The best answer I can suggest is that the following statements, run as
the super user while no other users are connected, will list the names
of tables that are completely empty:

-- First clear out dead tuples and truncate any tables to the position
-- of the last live tuple:
VACUUM;
-- Then find any zero size tables that result:
SELECT relname FROM pg_catalog.pg_class WHERE relpages = 0
AND NOT relisshared AND NOT relhassubclass AND relkind = 'r'
AND relnamespace <>
    (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog');




By "no tuples visible to any currently running transaction" I mean one
where all tuples have been deleted by a transaction that committed
before any currently running transactions started. If you VACUUM such a
table, it is truncated to zero size.

With only one session running in the database, for example, we can
create a dummy table then delete all records from it and see that it's
still non-zero size even though no transactions that can "see" the
deleted data are still running. When we VACUUM it, though, the dead
tuples are marked and the table is truncated to the position of the last
"live" tuple. As there aren't any live tuples, it gets truncated to zero
size:

# CREATE TABLE test AS SELECT * FROM generate_series(0,1000) AS x;

# SELECT pg_relation_size('test'::regclass);
32768

# DELETE FROM test;

# SELECT current_query, procpid FROM pg_stat_activity
  WHERE procpid <> pg_backend_pid();
(0 rows returned - no currently running transaction can see the data we
just deleted.)

# SELECT pg_relation_size('test'::regclass);
32768

# VACUUM test;

# SELECT pg_relation_size('test'::regclass);
0



--
Craig Ringer

Re: all empty tables

From
Merlin Moncure
Date:
On Sat, Nov 14, 2009 at 6:01 AM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> On 14/11/2009 6:12 PM, Garry Saddington wrote:
>> How could I list all the tables in a database that do not contain any data?
>> I have looked at reltuples but can't quite work out how to use it, any
>> pointers would be much apreciated.
>
> Define "empty". In a MVCC database, it's harder than you'd think.
>
> A table of zero on-disk size is definitely empty. So is a table where no
> tuples are visible to any currently running or future transactions.
> Anything else is arguable. What if the tuples have all been DELETEd by a
> transaction that's committed, but one or more statements (or
> SERIALIZABLE transactions) are running that are working with the state
> of the database as it was before the DELETE committed? Is the table then
> empty?
>
>
>
> The best answer I can suggest is that the following statements, run as
> the super user while no other users are connected, will list the names
> of tables that are completely empty:
>
> -- First clear out dead tuples and truncate any tables to the position
> -- of the last live tuple:
> VACUUM;
> -- Then find any zero size tables that result:
> SELECT relname FROM pg_catalog.pg_class WHERE relpages = 0
> AND NOT relisshared AND NOT relhassubclass AND relkind = 'r'
> AND relnamespace <>
>    (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog');
>
>
>
>
> By "no tuples visible to any currently running transaction" I mean one
> where all tuples have been deleted by a transaction that committed
> before any currently running transactions started. If you VACUUM such a
> table, it is truncated to zero size.
>
> With only one session running in the database, for example, we can
> create a dummy table then delete all records from it and see that it's
> still non-zero size even though no transactions that can "see" the
> deleted data are still running. When we VACUUM it, though, the dead
> tuples are marked and the table is truncated to the position of the last
> "live" tuple. As there aren't any live tuples, it gets truncated to zero
> size:
>
> # CREATE TABLE test AS SELECT * FROM generate_series(0,1000) AS x;
>
> # SELECT pg_relation_size('test'::regclass);
> 32768
>
> # DELETE FROM test;
>
> # SELECT current_query, procpid FROM pg_stat_activity
>  WHERE procpid <> pg_backend_pid();
> (0 rows returned - no currently running transaction can see the data we
> just deleted.)
>
> # SELECT pg_relation_size('test'::regclass);
> 32768
>
> # VACUUM test;
>
> # SELECT pg_relation_size('test'::regclass);
> 0

If all you care about is if a  table has no rows visible to the
current transaction, wouldn't:
select coalesce((select 1 from the_table limit 1), 0)::bool as not_empty;

be an efficient way to do it (hooking into your idea to get the list
of tables to check above)?

merlin

Re: all empty tables

From
Craig Ringer
Date:
On 15/11/2009 5:04 AM, Merlin Moncure wrote:
> On Sat, Nov 14, 2009 at 6:01 AM, Craig Ringer
> <craig@postnewspapers.com.au> wrote:
>> On 14/11/2009 6:12 PM, Garry Saddington wrote:
>>> How could I list all the tables in a database that do not contain any data?
>>> I have looked at reltuples but can't quite work out how to use it, any
>>> pointers would be much apreciated.
>>
>> Define "empty". In a MVCC database, it's harder than you'd think.

> If all you care about is if a  table has no rows visible to the
> current transaction, wouldn't:
> select coalesce((select 1 from the_table limit 1), 0)::bool as not_empty;
>
> be an efficient way to do it (hooking into your idea to get the list
> of tables to check above)?

Yep. You could wrap something like that up in a PL/PgSQL function that
looped over the list of table names obtained from pg_class (with schema
names from pg_namespace) and EXECUTEd a query built for each one.

--
Craig Ringer