Thread: Estimating Database Disk Space
Hi All, I'm running postgres.7.1.3-7 I'm concerned at the steady growth of disk space taken up by my postgres. I have three databases, each of which take up about 2+ MB when dumped. I have tried deleting one database and re-installing. This suggests that about 38Mb of disk space is taken up by each database. I suspect that if postgres were deleted entirely, about 250 MB would be freed. The FAQ suggests that about 7MB might be taken up by one database when copied from a file into postgres. Other posts suggest that xlog might add a further 20MB as an overhead. Indexes are mentioned elsewhere, but without any means of estimating what their size should be. Do the above figures seem sensible? Is the required disk space likely to grow in proportion to the data, or is there an overhead of about 60MB of disk space needed to run each database? TIA Richard A Lough
Richard A Lough <ralough.ced@dnet.co.uk> writes: > I have three databases, each of which take up about 2+ MB when dumped. > I have tried deleting one database and re-installing. This suggests that > about 38Mb of disk space is taken up by each database. I suspect that if > postgres were deleted entirely, about 250 MB would be freed. A "du" on your $PGDATA directory would give you some facts instead of guesses. FWIW, the per-database overhead is a couple of megabytes in my experience (basically the size of another set of system catalogs). pg_xlog will occupy some small multiple of 16MB depending on your WAL configuration settings. This is independent of how many databases you have in the installation --- but it is affected by your transaction rate. If your installation has been around for awhile then you might be noticing growth of pg_log --- it grows at a steady rate of 2 bits per transaction. In 7.2 and later there are provisions to recycle that space, but not in 7.1. Or you might be seeing problems with index bloat. Hard to tell with no breakdown of the space usage to go on. regards, tom lane
Tom Lane wrote: > > Richard A Lough <ralough.ced@dnet.co.uk> writes: > > I have three databases, each of which take up about 2+ MB when dumped. > > I have tried deleting one database and re-installing. This suggests that > > about 38Mb of disk space is taken up by each database. I suspect that if > > postgres were deleted entirely, about 250 MB would be freed. > > A "du" on your $PGDATA directory would give you some facts instead of > guesses. > It seems to have grown a little, du says: 250540 base 32812 pg_xlog 472 global 4 postmaster.pid 4 postmaster.opts 4 PG_VERSION 0 postgresql.conf Databases, dumped in text format are 3.3MB, 3.3MB and 1.6MB, though there are be a couple of empty temporary databases on the system as well. Thanks for your reply Richard A Lough
Richard A Lough <ralough.ced@dnet.co.uk> writes: >> A "du" on your $PGDATA directory would give you some facts instead of >> guesses. >> > It seems to have grown a little, du says: > 250540 base > 32812 pg_xlog > 472 global > 4 postmaster.pid > 4 postmaster.opts > 4 PG_VERSION > 0 postgresql.conf Okay, so your space usage is indeed mostly in the databases, and not the overhead like pg_xlog. > Databases, dumped in text format are 3.3MB, 3.3MB and 1.6MB, > though there are be a couple of empty temporary databases 250MB of disk space does seem a tad high for that much data. Have you vacuumed lately? If so, you could drill down to the next level of detail by looking at the relpages column of pg_class in each database to see which tables or indexes are using the most space. (relpages is measured in units of 8K disk blocks, btw, and it's only approximate because it's generally only updated by vacuum.) Something like this would give you the top ten offenders: select relname, relpages from pg_class order by relpages desc limit 10; regards, tom lane
Tom Lane wrote: > SNIP > > Databases, dumped in text format are 3.3MB, 3.3MB and 1.6MB, > > though there are be a couple of empty temporary databases > > 250MB of disk space does seem a tad high for that much data. Have you > vacuumed lately? If so, you could drill down to the next level of > detail by looking at the relpages column of pg_class in each database > to see which tables or indexes are using the most space. (relpages is > measured in units of 8K disk blocks, btw, and it's only approximate > because it's generally only updated by vacuum.) Something like this > would give you the top ten offenders: > > select relname, relpages from pg_class order by relpages desc limit 10; > > regards, tom lane > Hmmm.. \l tells me I have nine databases, including postgres, lo_demo, template0, and template1. Template0 refuses a connection, and three of the other databases are empty. Only one of my active databases shows significant figures (recently increased): relpages | relname 17473 pg_attribute_relid_attnam_index 7267 pg_attribute_relid_attnum_index 3245 pg_attribute 1012 pg_statistic_relid_att_index 646 pg_class_oid_index 645 pg_type_oid_index 507 pg_class_relname_index 478 pg_type_typname_index 77 pg_class 76 pg_type Of these pg_attribute and pg_statistic_relid_att_index have grown significantly (3245/909, and 1012/549 respectively). At most I have added an average of one tuple to the 3126 tables. I'm adding frequently, so size is a moving target. I had a look (I think) at the indexes for nasdaq01 using pgaccess. No indexes were shown, I'm not sure I got that right. Size dumped with pg_dump is 2.6MB for the relpages given above. Richard A Lough
Richard A Lough <ralough.ced@dnet.co.uk> writes: > Only one of my active databases > shows significant figures (recently increased): > relpages | relname > 17473 pg_attribute_relid_attnam_index > 7267 pg_attribute_relid_attnum_index > 3245 pg_attribute Yipes ... you must do a lot of table creation and deletion. The index bloat here is a known result of inefficiency in vacuuming indexes (there's no good way to reclaim index pages at the moment). You could probably fix it by REINDEXing pg_attribute, but it might be easier to just dump, dropdb, createdb, reload that database. Yes, there's a TODO item to improve index vacuuming ... regards, tom lane
Tom Lane wrote: > > Richard A Lough <ralough.ced@dnet.co.uk> writes: > > Only one of my active databases > > shows significant figures (recently increased): > > > relpages | relname > > 17473 pg_attribute_relid_attnam_index > > 7267 pg_attribute_relid_attnum_index > > 3245 pg_attribute > > Yipes ... you must do a lot of table creation and deletion. > > The index bloat here is a known result of inefficiency in vacuuming > indexes (there's no good way to reclaim index pages at the moment). > You could probably fix it by REINDEXing pg_attribute, but it might > be easier to just dump, dropdb, createdb, reload that database. > I kinda suspected something was not quite right. The input method is a perl script which was "borrowed." It's probably not the most efficient for this task, but it works and code reuse is good, no? :-) > Yes, there's a TODO item to improve index vacuuming ... > Speaking of TODO's, I think I have enough material for a first draft on disk storage use. If you can point me at your documentation people I will forward it. Thanks again for your help. Richard A Lough
Richard A Lough <ralough.ced@dnet.co.uk> writes: > Speaking of TODO's, I think I have enough material for a first draft > on disk storage use. If you can point me at your documentation people > I will forward it. pgsql-docs mailing list is the right place for that sort of discussion. regards, tom lane