Re: Help estimating database and WAL size - Mailing list pgsql-general
From: Jasen Betts
Subject: Re: Help estimating database and WAL size
Msg-id: k5b1ij$74q$1@reversiblemaps.ath.cx
In response to: Help estimating database and WAL size ("Daniel Serodio (lists)" <daniel.lists@mandic.com.br>)
Responses: Re: Help estimating database and WAL size
List: pgsql-general
On 2012-10-08, Daniel Serodio (lists) <daniel.lists@mandic.com.br> wrote:
> We are preparing a PostgreSQL database for production usage and we need
> to estimate the storage size for this database. We're a team of
> developers with low expertise on database administration, so we are
> doing research, reading manuals and using our general IT knowledge to
> achieve this.
>
> We have actual data to migrate to this database and some rough
> estimations of growth. For the sake of the example, let's say we have an
> estimated growth of 50% per year.
>
> The point is: what's the proper general technique for doing a good size
> estimation?
>
> We are estimating the storage usage by the following rules. Topics where
> we need advice are marked with ** asterisks **. Feedback on the whole
> process is more than welcome.
>
> 1) Estimate the size of each table
>
> 1.1) Discover the actual size of each row.
>     - For fields with a fixed size (like bigint, char, etc) we used
>       the sizes described in the documentation
>     - For fields with a dynamic size (like text) we estimated the
>       string length and used the function
>       select pg_column_size('expected text here'::text)

Long text is subject to compression, and pg_column_size on a literal
doesn't seem to test compression; the compression is some sort of LZ.

>     - We added 4 more bytes for the OID that PostgreSQL uses internally

The OID is optional; IIRC the transaction IDs (xmin/xmax) in the row
header are not.

> 1.2) Multiply the size of each row by the number of estimated rows
>
> ** Do I need to consider any overhead here, like row or table
> metadata? **

Page size is 8K; column overhead is 1 byte per not-NULL column, NULLs
are free.

> 2) Estimate the size of each table index
>
> ** Don't know how to estimate this, need advice here **

IIRC: ( data being indexed + 8 bytes ) / fill factor

> 3) Estimate the size of the transaction log
>
> ** We've got no idea how to estimate this, need advice **

How big are your transactions?

> 4) Estimate the size of the backups (full and incremental)
>
> ** Don't know how to estimate this, need advice here **

That depends on the format you use; backups tend to compress well.

> 5) Sum all the estimates for the actual minimum size

No, you get the estimated size.

> 6) Apply a factor of 1.5x (the 50% growth) to the sum of estimates
> 1, 2 and 4 for the minimum size after 1 year
>
> 7) Apply an overall factor of 1.2 ~ 1.4 (20% to 40% more) to estimates 5
> and 6 for a good safety margin
>
> I know the rules got pretty extensive, please let me know if you need
> more data or examples for a better understanding.
>
> We've also posted this question to
> http://dba.stackexchange.com/q/25617/10166
>
> Thanks in advance,
> Daniel Serodio

--
⚂⚃ 100% natural
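Rather than working only from the documented type sizes, you can load a
representative sample and measure what it actually costs on disk; that
number already includes page, tuple-header, TOAST and index overhead. A
minimal sketch, assuming a throwaway table called sample_t (the table,
its columns, and the generated test data are made up for illustration;
pg_column_size, pg_relation_size, pg_indexes_size and
pg_total_relation_size are standard PostgreSQL functions):

  -- Hypothetical scratch table; substitute your real schema and sample data.
  CREATE TABLE sample_t (
      id   bigint PRIMARY KEY,
      body text
  );

  -- Repetitive text like this is large enough to be TOASTed and compresses well.
  INSERT INTO sample_t (id, body)
  SELECT g, repeat('expected text here ', 200)
  FROM generate_series(1, 100000) AS g;

  ANALYZE sample_t;

  -- Size of a stored datum, after whatever compression was applied:
  SELECT pg_column_size(body) AS body_bytes
  FROM sample_t
  LIMIT 1;

  -- Heap only, indexes only, and the grand total (heap + indexes + TOAST):
  SELECT pg_relation_size('sample_t')       AS heap_bytes,
         pg_indexes_size('sample_t')        AS index_bytes,
         pg_total_relation_size('sample_t') AS total_bytes;

Dividing total_bytes by the row count gives an all-in bytes-per-row
figure, which is usually more reliable than summing column sizes by
hand.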
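For the transaction log question, one way to get a number is to run a
representative batch and measure how much WAL it generates. A rough
sketch using the 9.2-era function names (on PostgreSQL 10 and later
they are called pg_current_wal_lsn() and pg_wal_lsn_diff(); the
workload in the middle is whatever is typical for your application):

  -- Remember the current WAL position.
  CREATE TEMP TABLE wal_mark AS
  SELECT pg_current_xlog_location() AS lsn;

  -- ... run a typical transaction, or a batch of them, here ...

  -- Bytes of WAL written since the mark.
  SELECT pg_xlog_location_diff(pg_current_xlog_location(), lsn) AS wal_bytes
  FROM wal_mark;

Multiply that by your expected transaction rate, and by how long WAL is
retained (wal_keep_segments, archiving, the base-backup window), to get
a ballpark for the pg_xlog and archive space.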