Thread: backup getting larger and larger

backup getting larger and larger

From
Ivan Sergio Borgonovo
Date:
I still have to investigate if the tables are getting really
larger... but at a first guess there shouldn't be any good reason to
see tables getting so large so fast... so I was wondering if
anything could contribute to make a backup much larger than it was
other than table containing more records?

The only thing that should have been really changed is the number of
concurrent connections during a backup.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: backup getting larger and larger

From
Steve Crawford
Date:
Ivan Sergio Borgonovo wrote:
> I still have to investigate if the tables are getting really
> larger... but at a first guess there shouldn't be any good reason to
> see tables getting so large so fast... so I was wondering if
> anything could contribute to make a backup much larger than it was
> other than table containing more records?
>
> The only thing that should have been really changed is the number of
> concurrent connections during a backup.
>
Can we assume that by backup you mean pg_dump/pg_dumpall? If so, then
the change is likely due to increasing data in the database. I have a
daily report that emails me a crude but useful estimate of table
utilization based on this query:

select
  relname as table,
  to_char(8*relpages, '999,999,999')  as "size (kB)",
  (100.0*relpages/(select sum(relpages) from pg_class where
relkind='r'))::numeric(4,1) as percent
from
  pg_class
where
  relkind = 'r'
order by
  relpages desc
limit 20;


If, however, you are doing a filesystem backup then table and index
bloat could, indeed, increase your backup size. But more importantly,
you will likely be in for a rude surprise should you ever need to restore.

Cheers,
Steve


Re: backup getting larger and larger

From
Ivan Sergio Borgonovo
Date:
On Tue, 14 Apr 2009 13:26:24 -0700
Steve Crawford <scrawford@pinpointresearch.com> wrote:

> Ivan Sergio Borgonovo wrote:
> > I still have to investigate if the tables are getting really
> > larger... but at a first guess there shouldn't be any good
> > reason to see tables getting so large so fast... so I was
> > wondering if anything could contribute to make a backup much
> > larger than it was other than table containing more records?
> >
> > The only thing that should have been really changed is the
> > number of concurrent connections during a backup.
> >
> Can we assume that by backup you mean pg_dump/pg_dumpall? If so,
> then the change is likely due to increasing data in the database.
> I have a daily report that emails me a crude but useful estimate
> of table utilization based on this query:
>
> select
>   relname as table,
>   to_char(8*relpages, '999,999,999')  as "size (kB)",
>   (100.0*relpages/(select sum(relpages) from pg_class where
> relkind='r'))::numeric(4,1) as percent
> from
>   pg_class
> where
>   relkind = 'r'
> order by
>   relpages desc
> limit 20;

Thanks, very useful.
May I place it on my site as a reference, giving credits of course?

Still puzzled...

The first and second largest table make up for 70% of the overall DB
size (1st 53%, 2nd 16.1%)
The second one have very few small fields but ~2x the number of
records of the first.
Comparatively a row of the first one is at least 10x larger than a
row in the second one.
The first has 1M records.

All the others following with a size larger than 1% grow as 1x the
number of records of the first one.
I had an increment of less than 10% of the number of records of the
first table but an increment of roughly 80% of the size of backup.

Maybe it is due to compression. The table that grew more can't be
shrunk too well.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: backup getting larger and larger

From
Chander Ganesan
Date:
Steve Crawford wrote:
> Ivan Sergio Borgonovo wrote:
>> I still have to investigate if the tables are getting really
>> larger... but at a first guess there shouldn't be any good reason to
>> see tables getting so large so fast... so I was wondering if
>> anything could contribute to make a backup much larger than it was
>> other than table containing more records?
>>
>> The only thing that should have been really changed is the number of
>> concurrent connections during a backup.
>>
> Can we assume that by backup you mean pg_dump/pg_dumpall? If so, then
> the change is likely due to increasing data in the database. I have a
> daily report that emails me a crude but useful estimate of table
> utilization based on this query:
>
> select
>  relname as table,
>  to_char(8*relpages, '999,999,999')  as "size (kB)",
>  (100.0*relpages/(select sum(relpages) from pg_class where
> relkind='r'))::numeric(4,1) as percent
> from
>  pg_class
> where
>  relkind = 'r'
> order by
>  relpages desc
> limit 20;
The better way to do this would likely be to use the pg_*_size functions
detailed here:

http://www.postgresql.org/docs/8.3/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE

In particular pg_total_relation_size() , |pg_size_pretty|(), and the
like...  Seems much more straightforward than the queries mentioned above..

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/877-258-8987
http://www.otg-nc.com
Ask me about expert PostgreSQL training, delivered worldwide!


Re: backup getting larger and larger

From
Steve Crawford
Date:
Chander Ganesan wrote:
> Steve Crawford wrote:
>> Ivan Sergio Borgonovo wrote:
>>> I still have to investigate if the tables are getting really
>>> larger...
>> Can we assume that by backup you mean pg_dump/pg_dumpall? If so, then
>> the change is likely due to increasing data in the database. I have a
>> daily report that emails me a crude but useful estimate of table
>> utilization based on this query:
>> ...
> The better way to do this would likely be to use the pg_*_size
> functions detailed here:
>
> http://www.postgresql.org/docs/8.3/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE
>
>
> In particular pg_total_relation_size() , |pg_size_pretty|(), and the
> like...  Seems much more straightforward than the queries mentioned
> above..
>
You are, of course, correct. That is a better choice if you are up-to-date.

Unfortunately I still have some lingering servers that haven't been
upgraded to a version that includes that functionality (and I know I'm
not alone...) :)

Cheers,
Steve