Thread: all empty tables
How could I list all the tables in a database that do not contain any data? I have looked at reltuples but can't quite work out how to use it, any pointers would be much apreciated. Regards Garry
On lör, 2009-11-14 at 10:12 +0000, Garry Saddington wrote: > How could I list all the tables in a database that do not contain any data? > I have looked at reltuples but can't quite work out how to use it, any > pointers would be much apreciated. select * from pg_class where relkind = 'r' and reltuples = 0; ? Of course reltuples is only a statistic, not an up to date value, but the above should be a pretty helpful start to detect possibly useless tables.
Garry Saddington <garry@schoolteachers.co.uk> wrote: > How could I list all the tables in a database that do not contain any data? > I have looked at reltuples but can't quite work out how to use it, any > pointers would be much apreciated. reltuples contains only an estimation, if you need the real values you need to look at every table. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On 14/11/2009 6:12 PM, Garry Saddington wrote: > How could I list all the tables in a database that do not contain any data? > I have looked at reltuples but can't quite work out how to use it, any > pointers would be much apreciated. Define "empty". In a MVCC database, it's harder than you'd think. A table of zero on-disk size is definitely empty. So is a table where no tuples are visible to any currently running or future transactions. Anything else is arguable. What if the tuples have all been DELETEd by a transaction that's committed, but one or more statements (or SERIALIZABLE transactions) are running that are working with the state of the database as it was before the DELETE committed? Is the table then empty? The best answer I can suggest is that the following statements, run as the super user while no other users are connected, will list the names of tables that are completely empty: -- First clear out dead tuples and truncate any tables to the position -- of the last live tuple: VACUUM; -- Then find any zero size tables that result: SELECT relname FROM pg_catalog.pg_class WHERE relpages = 0 AND NOT relisshared AND NOT relhassubclass AND relkind = 'r' AND relnamespace <> (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog'); By "no tuples visible to any currently running transaction" I mean one where all tuples have been deleted by a transaction that committed before any currently running transactions started. If you VACUUM such a table, it is truncated to zero size. With only one session running in the database, for example, we can create a dummy table then delete all records from it and see that it's still non-zero size even though no transactions that can "see" the deleted data are still running. When we VACUUM it, though, the dead tuples are marked and the table is truncated to the position of the last "live" tuple. As there aren't any live tuples, it gets truncated to zero size: # CREATE TABLE test AS SELECT * FROM generate_series(0,1000) AS x; # SELECT pg_relation_size('test'::regclass); 32768 # DELETE FROM test; # SELECT current_query, procpid FROM pg_stat_activity WHERE procpid <> pg_backend_pid(); (0 rows returned - no currently running transaction can see the data we just deleted.) # SELECT pg_relation_size('test'::regclass); 32768 # VACUUM test; # SELECT pg_relation_size('test'::regclass); 0 -- Craig Ringer
On Sat, Nov 14, 2009 at 6:01 AM, Craig Ringer <craig@postnewspapers.com.au> wrote: > On 14/11/2009 6:12 PM, Garry Saddington wrote: >> How could I list all the tables in a database that do not contain any data? >> I have looked at reltuples but can't quite work out how to use it, any >> pointers would be much apreciated. > > Define "empty". In a MVCC database, it's harder than you'd think. > > A table of zero on-disk size is definitely empty. So is a table where no > tuples are visible to any currently running or future transactions. > Anything else is arguable. What if the tuples have all been DELETEd by a > transaction that's committed, but one or more statements (or > SERIALIZABLE transactions) are running that are working with the state > of the database as it was before the DELETE committed? Is the table then > empty? > > > > The best answer I can suggest is that the following statements, run as > the super user while no other users are connected, will list the names > of tables that are completely empty: > > -- First clear out dead tuples and truncate any tables to the position > -- of the last live tuple: > VACUUM; > -- Then find any zero size tables that result: > SELECT relname FROM pg_catalog.pg_class WHERE relpages = 0 > AND NOT relisshared AND NOT relhassubclass AND relkind = 'r' > AND relnamespace <> > (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog'); > > > > > By "no tuples visible to any currently running transaction" I mean one > where all tuples have been deleted by a transaction that committed > before any currently running transactions started. If you VACUUM such a > table, it is truncated to zero size. > > With only one session running in the database, for example, we can > create a dummy table then delete all records from it and see that it's > still non-zero size even though no transactions that can "see" the > deleted data are still running. When we VACUUM it, though, the dead > tuples are marked and the table is truncated to the position of the last > "live" tuple. As there aren't any live tuples, it gets truncated to zero > size: > > # CREATE TABLE test AS SELECT * FROM generate_series(0,1000) AS x; > > # SELECT pg_relation_size('test'::regclass); > 32768 > > # DELETE FROM test; > > # SELECT current_query, procpid FROM pg_stat_activity > WHERE procpid <> pg_backend_pid(); > (0 rows returned - no currently running transaction can see the data we > just deleted.) > > # SELECT pg_relation_size('test'::regclass); > 32768 > > # VACUUM test; > > # SELECT pg_relation_size('test'::regclass); > 0 If all you care about is if a table has no rows visible to the current transaction, wouldn't: select coalesce((select 1 from the_table limit 1), 0)::bool as not_empty; be an efficient way to do it (hooking into your idea to get the list of tables to check above)? merlin
On 15/11/2009 5:04 AM, Merlin Moncure wrote: > On Sat, Nov 14, 2009 at 6:01 AM, Craig Ringer > <craig@postnewspapers.com.au> wrote: >> On 14/11/2009 6:12 PM, Garry Saddington wrote: >>> How could I list all the tables in a database that do not contain any data? >>> I have looked at reltuples but can't quite work out how to use it, any >>> pointers would be much apreciated. >> >> Define "empty". In a MVCC database, it's harder than you'd think. > If all you care about is if a table has no rows visible to the > current transaction, wouldn't: > select coalesce((select 1 from the_table limit 1), 0)::bool as not_empty; > > be an efficient way to do it (hooking into your idea to get the list > of tables to check above)? Yep. You could wrap something like that up in a PL/PgSQL function that looped over the list of table names obtained from pg_class (with schema names from pg_namespace) and EXECUTEd a query built for each one. -- Craig Ringer