Re: Help estimating database and WAL size - Mailing list pgsql-general
| From | Jasen Betts |
|---|---|
| Subject | Re: Help estimating database and WAL size |
| Date | |
| Msg-id | k5b1ij$74q$1@reversiblemaps.ath.cx |
| In response to | Help estimating database and WAL size ("Daniel Serodio (lists)" <daniel.lists@mandic.com.br>) |
| Responses | Re: Help estimating database and WAL size |
| List | pgsql-general |
On 2012-10-08, Daniel Serodio (lists) <daniel.lists@mandic.com.br> wrote:
> We are preparing a PostgreSQL database for production usage and we need
> to estimate the storage size for this database. We're a team of
> developers with low expertise on database administration, so we are
> doing research, reading manuals and using our general IT knowledge to
> achieve this.
>
> We have actual data to migrate to this database and some rough
> estimations of growth. For the sake of the example, let's say we have an
> estimation of growth of 50% per year.
>
> The point is: what's the general proper technique for doing a good size
> estimation?
>
> We are estimating the storage usage by the following rules. Topics where
> we need advice are marked with ** asterisks **. Feedback on the whole
> process is more than welcome.
>
> 1) Estimate the size of each table
> 1.1) Discover the actual size of each row.
> - For fields with a fixed size (like bigint, char, etc) we used
> the sizes described in the documentation
> - For fields with a dynamic size (like text) we estimated the
> string length and used the function select pg_column_size('expected text
> here'::text)
Long text values are subject to compression. pg_column_size() on a bare
literal doesn't seem to reflect that compression; the algorithm used is an
LZ variant.
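For example, a compressible value looks much bigger when measured as a bare
literal than it does after it has been stored and compressed (table and value
names here are just for illustration):

    -- in-memory size of a literal text datum (no compression applied)
    select pg_column_size(repeat('estimate ', 1000)::text);   -- about 9 kB

    -- the same value stored with the default EXTENDED storage is
    -- compressed on disk, and pg_column_size then reports the stored size
    create temp table sizetest (body text);
    insert into sizetest values (repeat('estimate ', 1000));
    select pg_column_size(body) from sizetest;                 -- far smaller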
> - We added 4 more bytes for the OID that PostgreSQL uses internally
The OID column is optional; IIRC the transaction ID fields (xmin/xmax) in the
row header are not.
> 1.2) Multiply the size of each row by the number of estimated rows
> ** Do I need to consider any overhead here, like row or table
> metadata? **
The page size is 8 kB. Column overhead is about 1 byte per not-NULL column;
NULLs themselves are free.
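There is also a fixed per-row header (a couple of dozen bytes) plus alignment
padding, so rather than adding all of this up by hand it is often easier to
load a representative sample and measure it (the table name below is
hypothetical):

    -- after loading, say, 10000 representative rows:
    select pg_size_pretty(pg_relation_size('mytable'));        -- heap only
    select pg_size_pretty(pg_total_relation_size('mytable'));  -- heap + indexes + TOAST
    select pg_total_relation_size('mytable') / 10000 as bytes_per_row;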
> 2) Estimate the size of each table index
> ** Don't know how to estimate this, need advice here **
IIRC, a B-tree index is roughly:

    (size of the data being indexed + 8 bytes per entry) / fill factor
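(The default fill factor for a B-tree is 90%.) The easiest sanity check is to
build the index on a representative sample and measure it; the names here are
hypothetical:

    create index mytable_created_idx on mytable (created_at);
    select pg_size_pretty(pg_relation_size('mytable_created_idx'));
    -- scale roughly linearly with the expected row count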
> 3) Estimate the size of the transaction log
> ** We've got no idea how to estimate this, need advice **
How big are your transactions? WAL volume is driven by how much data you
write, so it is best measured rather than guessed.
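A practical way to estimate it is to note the WAL insert position before and
after a representative workload (function names below are the 9.x ones; they
were renamed to pg_current_wal_lsn()/pg_wal_lsn_diff() in PostgreSQL 10, and
the positions shown are just placeholders):

    select pg_current_xlog_location();    -- e.g. 0/16B3740
    -- ... run a representative batch of transactions ...
    select pg_current_xlog_location();    -- e.g. 0/17A2D80
    -- on 9.2+ the difference in bytes can be computed directly:
    select pg_xlog_location_diff('0/17A2D80', '0/16B3740');

Note that the steady-state size of the pg_xlog directory itself is bounded
mainly by checkpoint_segments (16 MB per segment) rather than by total write
volume.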
> 4) Estimate the size of the backups (full and incremental)
> ** Don't know how to estimate this, need advice here **
That depends on the format you use; backups tend to compress well (pg_dump's
custom format is compressed by default).
> 5) Sum all the estimates for the actual minimum size
No; summing those gives you an estimated size, not an actual minimum.
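Once real data is loaded you can compare the estimate against reality with the
stock admin functions (nothing project-specific here):

    select pg_size_pretty(pg_database_size(current_database()));

    -- largest tables, including their indexes and TOAST data
    -- (system catalogs will show up here too)
    select relname, pg_size_pretty(pg_total_relation_size(oid))
    from pg_class
    where relkind = 'r'
    order by pg_total_relation_size(oid) desc
    limit 10;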
> 6) Apply a factor of 1.5x (the 50% growth) to the sum of the estimates
> 1, 2 and 4 for the minimum size after 1 year
>
> 7) Apply an overall factor of 1.2 ~ 1.4 (20% to 40% more) to estimates 5
> and 6 for a good safety margin
>
> I know the rules got pretty extensive, please let me know if you need
> more data or examples for a better understanding.
>
> We've also posted this question to
> http://dba.stackexchange.com/q/25617/10166
>
> Thanks in advance,
> Daniel Serodio
>
>
--
⚂⚃ 100% natural