Re: Database takes up MUCH more disk space than it should - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Database takes up MUCH more disk space than it should
Date
Msg-id 201201211315.39452.adrian.klaver@gmail.com
Whole thread Raw
In response to Database takes up MUCH more disk space than it should  (Dan Charrois <dan001@syz.com>)
Responses Re: Database takes up MUCH more disk space than it should  (Dan Charrois <dan001@syz.com>)
List pgsql-general
On Saturday, January 21, 2012 12:37:17 am Dan Charrois wrote:
> Hi everyone.  I'm currently in the situation of administering a rather
> large PostgreSQL database which for some reason seems to be even much
> larger than it should be.
>
> I'm currently running version 8.4.5 - not the latest and greatest, I know -
> but this is a live database that would problematic to take down to upgrade
> unless all else fails - especially considering its size if it does need to
> be rebuilt somehow.
>

>
> The size of the tables reported by \dt+ add up to around 120 GB.  The size
> of the indexes reported with \di+ adds up to around 15 GB.  This is pretty
> consistent with what I would expect the data to require.
>
> The problem is, the disk usage of the pgsql directory where the data is
> kept (as reported by 'du') comes to 647 GB - significantly more than it
> should.  select pg_database_size('mydatabase') confirms this, returning
> 690830939920.
>

>
> So how do I find out what's eating up all this extra space?
>
> I'm not sure this is related, but in doing a bit of digging I ran across
> the following command to try and see where the space is being used:
>
> SELECT relname, pg_size_pretty(relpages::bigint * 8 *1024) AS size, CASE
> WHEN relkind = 't' THEN (SELECT pgd.relname FROM pg_class pgd WHERE
> pgd.relfilenode::text = SUBSTRING(pg.relname FROM 10)) ELSE (SELECT
> pgc.relname FROM pg_class pgc WHERE pg.reltoastrelid = pgc.relfilenode)
> END AS refrelname, relfilenode, relkind, reltuples::bigint, relpages FROM
> pg_class pg ORDER BY relpages DESC;
>
> The biggest culprit in this is a file named pg_toast_101748 which weighs in
> at 242 GB.  I understand that the toast files are supplemental storage
> files linked to tables, but I'm wondering if that particular file (and
> perhaps others) have lost their links?  The reason I consider this is the
> third column - which typically shows database names corresponding to most
> other toast files, is completely empty for that one.  There are other
> toast files too that don't seem to refer to a "real" database, but they
> only weight in at 2 GB or less, so they're less of a problem.
>

If I follow the query above correctly, it is not getting the information you
think it is. In particular this part:

...SELECT pgc.relname FROM pg_class pgc WHERE pg.reltoastrelid = pgc.relfilenode)

Per the docs:
http://www.postgresql.org/docs/8.4/interactive/catalog-pg-class.html
reltoastrelid = The OID of the TOAST table not the relfilenode
When I table is created those numbers are the same, but they can diverge over
time.

I would do something like
select oid, relfilenode, relname from pg_class where relname = 'pg_toast_101748';

This will get you the OID and also show if it differs from the relfilenode.

Then something like:
select * from pg_class where relkind='r' and reltoastrelid=[oid from above]

This should show you if the TOAST table has been orphaned and if not what table
it is associated with.

>
> I've done a bunch of Google searching and haven't come up with anything so
> far to shed some light on this.  Any help someone could provide on how to
> figure out where this substantial amount of extra disk space is being used
> would be greatly appreciated!
>
> Thanks!
>
> Dan
> --
> Syzygy Research & Technology
> Box 83, Legal, AB  T0G 1L0 Canada
> Phone: 780-961-2213

--
Adrian Klaver
adrian.klaver@gmail.com

pgsql-general by date:

Previous
From: Jose Carlos Martinez Llario
Date:
Subject: log_statement variable does not admit all value
Next
From: Adrian Klaver
Date:
Subject: Re: log_statement variable does not admit all value