Re: Identifying old/unused views and table - Mailing list pgsql-general

From salah jubeh
Subject Re: Identifying old/unused views and table
Date
Msg-id 1317343029.84063.YahooMailNeo@web161513.mail.bf1.yahoo.com
Whole thread Raw
In response to Re: Identifying old/unused views and table  (Jason Long <mailing.lists@octgsoftware.com>)
List pgsql-general

Hello,

I had the same issue before and I used the  PostgreSQL statistics to see weather the tables are used or not. One thing that I could not solve is how to check if the schema design and  semantics are good. i.e. table a references table b,  table c references table b, and table  c reference table a. In some cases I find something like loops and circles in other cases, I find the same table referenced many times in other tables in the same schema. Any way,  here are my findings regarding how to clean up your data. 

1. Checking the number of sequential and indexed access to the table gives a good hint if the table is in use or deprecated.  The following select statement retrieve the tables that might be deprecated.
Select relname from
pg_stat_user_tables
WHERE (idx_tup_fetch + seq_tup_read)= 0; -- you can define a threshold here

2. Empty tables can be retrieved by checking the number of live tup i.e
Select relname from
pg_stat_user_tables
WHERE n_live_tup = 0;
3.  column can be checked using the null fraction in 
see http://www.postgresql.org/docs/8.3/static/view-pg-stats.html

4.  use  pg_constraints to determine the tables that depends on the above tables 

5. table duplicates i.e the table can be found in more than one schema 

SELECT
n.nspname as "Schema",
c.relname as "Name"
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname IN (SELECT relname FROM pg_catalog.pg_class
WHERE relkind IN ('r')
GROUP BY relname
Having count(relname) > 1)
ORDER BY 2,1;

6. For views there is no statistics, an easy way is to parse the log file using regular expressions and shell scripting and compare the result with the list of views and tables , I did that and I get many deprecated view. 

7. for duplicate data have a look on this query. 
-- if youneed table to find an exact duplicate replace <col1 >, ...  <coln > with table name
   SELECT       <col1 >, ...  <coln >,       min(ctid) AS keep,       count(*)   FROM <table>   GROUP BY  <col1 >, ...  <coln >   HAVING count(*) > 1

--- The above code snippet can be combined with delete statement to delete duplicate rows


Have fun 





From: Jason Long <mailing.lists@octgsoftware.com>
To: Guillaume Lelarge <guillaume@lelarge.info>
Cc: Craig Ringer <ringerc@ringerc.id.au>; pgsql-general@postgresql.org
Sent: Friday, September 30, 2011 12:12 AM
Subject: Re: [GENERAL] Identifying old/unused views and table

On Wed, 2011-09-28 at 08:52 +0200, Guillaume Lelarge wrote:
> On Wed, 2011-09-28 at 09:04 +0800, Craig Ringer wrote:
> > On 09/28/2011 04:51 AM, Jason Long wrote:
> > > I have an application with a couple hundred views and a couple hundred
> > > tables.
> > >
> > > Is there some way I can find out which views have been accessed in the
> > > last 6 months or so?  Or some way to log this?
> > >
> > > I know there are views and tables that are no longer in used by my
> > > application and I am looking for a way to identify them.
> >
> > Look at the pg_catalog.pg_stat* tables
> >
>
> I fail to see how that gives him any answer on the views, and tables no
> longer used. AFAICT, there's no way to know for views (apart from
> logging all queries in the log). As for tables, still apart from the
> log, pg_stat_user_tables could give an answer if he was monitoring it at
> least the last six months.
>
>

Thanks for the replies.  Views were my main problem.  My application
could use some cleanup.  Doing is manually is probably the best
approach.  I was just looking for a jump start. 



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


pgsql-general by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Streaming Replication and Firewall
Next
From: Joe Abbate
Date:
Subject: PL/Python