Thread: backup getting larger and larger
I still have to investigate whether the tables are really getting
larger... but at first guess there shouldn't be any good reason for
them to grow so large so fast... so I was wondering: other than the
tables containing more records, could anything else make a backup
much larger than it used to be?

The only thing that has really changed is the number of concurrent
connections during a backup.

thanks

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo wrote:
> I still have to investigate whether the tables are really getting
> larger... but at first guess there shouldn't be any good reason
> for them to grow so large so fast... so I was wondering: other
> than the tables containing more records, could anything else make
> a backup much larger than it used to be?
>
> The only thing that has really changed is the number of concurrent
> connections during a backup.
>
Can we assume that by backup you mean pg_dump/pg_dumpall? If so, then
the change is likely due to increasing data in the database. I have a
daily report that emails me a crude but useful estimate of table
utilization based on this query:

select
    relname as table,
    to_char(8*relpages, '999,999,999') as "size (kB)",
    (100.0*relpages/(select sum(relpages) from pg_class where
        relkind='r'))::numeric(4,1) as percent
from
    pg_class
where
    relkind = 'r'
order by
    relpages desc
limit 20;

If, however, you are doing a filesystem backup, then table and index
bloat could indeed increase your backup size. But more importantly,
you will likely be in for a rude surprise should you ever need to
restore.

Cheers,
Steve
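Steve's query counts only heap pages (relkind = 'r'), so for a
filesystem backup, where index files get copied too, it undercounts.
A minimal variation of the same crude 8 kB/page estimate that also
lists indexes (a sketch, not from the thread):

```sql
-- Crude size estimate including indexes (relkind = 'i') as well as
-- tables, since a filesystem backup copies index files too.
select
    relname,
    relkind,
    to_char(8*relpages, '999,999,999') as "size (kB)"
from
    pg_class
where
    relkind in ('r', 'i')
order by
    relpages desc
limit 20;
```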
On Tue, 14 Apr 2009 13:26:24 -0700
Steve Crawford <scrawford@pinpointresearch.com> wrote:

> Ivan Sergio Borgonovo wrote:
> > I still have to investigate whether the tables are really getting
> > larger... but at first guess there shouldn't be any good reason
> > for them to grow so large so fast... so I was wondering: other
> > than the tables containing more records, could anything else make
> > a backup much larger than it used to be?
> >
> > The only thing that has really changed is the number of
> > concurrent connections during a backup.
> >
> Can we assume that by backup you mean pg_dump/pg_dumpall? If so,
> then the change is likely due to increasing data in the database.
> I have a daily report that emails me a crude but useful estimate
> of table utilization based on this query:
>
> select
>     relname as table,
>     to_char(8*relpages, '999,999,999') as "size (kB)",
>     (100.0*relpages/(select sum(relpages) from pg_class where
>         relkind='r'))::numeric(4,1) as percent
> from
>     pg_class
> where
>     relkind = 'r'
> order by
>     relpages desc
> limit 20;

Thanks, very useful. May I put it on my site as a reference, giving
credit of course?

Still puzzled...

The first and second largest tables make up 70% of the overall DB
size (1st: 53%, 2nd: 16.1%). The second one has very few, small
fields but ~2x the number of records of the first; comparatively, a
row of the first is at least 10x larger than a row of the second.
The first has 1M records. All the other tables larger than 1% of the
DB grow in proportion to the number of records of the first.

I had an increase of less than 10% in the number of records of the
first table, but an increase of roughly 80% in the size of the
backup.

Maybe it is due to compression: the table that grew the most doesn't
compress very well.

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it
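One way to sanity-check the compression theory is to look at which
columns dominate row width in the fast-growing table: wide columns
holding near-random data compress far worse than repetitive ones. A
hedged sketch against the standard pg_stats view, where 'big_table'
is a placeholder name rather than a table from the thread:

```sql
-- Which columns dominate the average row width of a given table?
-- 'big_table' is a placeholder; substitute the fast-growing table.
select
    tablename,
    attname,
    avg_width
from
    pg_stats
where
    tablename = 'big_table'
order by
    avg_width desc;
```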
Steve Crawford wrote:
> Ivan Sergio Borgonovo wrote:
>> I still have to investigate whether the tables are really getting
>> larger... but at first guess there shouldn't be any good reason
>> for them to grow so large so fast... so I was wondering: other
>> than the tables containing more records, could anything else make
>> a backup much larger than it used to be?
>>
>> The only thing that has really changed is the number of concurrent
>> connections during a backup.
>>
> Can we assume that by backup you mean pg_dump/pg_dumpall? If so,
> then the change is likely due to increasing data in the database.
> I have a daily report that emails me a crude but useful estimate
> of table utilization based on this query:
>
> select
>     relname as table,
>     to_char(8*relpages, '999,999,999') as "size (kB)",
>     (100.0*relpages/(select sum(relpages) from pg_class where
>         relkind='r'))::numeric(4,1) as percent
> from
>     pg_class
> where
>     relkind = 'r'
> order by
>     relpages desc
> limit 20;

The better way to do this would likely be to use the pg_*_size
functions detailed here:

http://www.postgresql.org/docs/8.3/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE

In particular pg_total_relation_size(), pg_size_pretty(), and the
like... This seems much more straightforward than the query above.

-- 
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/877-258-8987
http://www.otg-nc.com
Ask me about expert PostgreSQL training, delivered worldwide!
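On a server new enough to have those functions (8.1 and later),
Steve's per-table report might be rewritten roughly like this sketch,
which also folds in index and TOAST pages:

```sql
-- Total on-disk size (table + indexes + TOAST) per table,
-- human-readable via pg_size_pretty().
select
    relname,
    pg_size_pretty(pg_total_relation_size(oid)) as total_size
from
    pg_class
where
    relkind = 'r'
order by
    pg_total_relation_size(oid) desc
limit 20;
```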
Chander Ganesan wrote:
> Steve Crawford wrote:
>> Ivan Sergio Borgonovo wrote:
>>> I still have to investigate if the tables are getting really
>>> larger...
>> Can we assume that by backup you mean pg_dump/pg_dumpall? If so,
>> then the change is likely due to increasing data in the database.
>> I have a daily report that emails me a crude but useful estimate
>> of table utilization based on this query:
>> ...
> The better way to do this would likely be to use the pg_*_size
> functions detailed here:
>
> http://www.postgresql.org/docs/8.3/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE
>
> In particular pg_total_relation_size(), pg_size_pretty(), and the
> like... This seems much more straightforward than the query above.
>
You are, of course, correct. That is the better choice if you are
up-to-date. Unfortunately I still have some lingering servers that
haven't been upgraded to a version that includes that functionality
(and I know I'm not alone...) :)

Cheers,
Steve