Re: Help estimating database and WAL size - Mailing list pgsql-general

From Jasen Betts
Subject Re: Help estimating database and WAL size
Date
Msg-id k5b1ij$74q$1@reversiblemaps.ath.cx
Whole thread Raw
In response to Help estimating database and WAL size  ("Daniel Serodio (lists)" <daniel.lists@mandic.com.br>)
Responses Re: Help estimating database and WAL size  ("Daniel Serodio (lists)" <daniel.lists@mandic.com.br>)
List pgsql-general
On 2012-10-08, Daniel Serodio (lists) <daniel.lists@mandic.com.br> wrote:
> We are preparing a PostgreSQL database for production usage and we need
> to estimate the storage size for this database. We're a team of
> developers with low expertise on database administration, so we are
> doing research, reading manuals and using our general IT knowledge to
> achieve this.
>
> We have actual data to migrate to this database and some rough
> estimations of growth. For the sake of the example, let's say we have a
> estimation of growth of 50% per year.
>
> The point is: what's the general proper technique for doing a good size
> estimation?
>
> We are estimating the storage usage by the following rules. Topics where
> we need advice are marked with ** asterisks **. Feedback on the whole
> process is more than welcome.
>
> 1) Estimate the size of each table
>      1.1) Discover the actual size of each row.
>          - For fields with a fixed size (like bigint, char, etc) we used
> the sizes described in the documentation
>          - For fields with a dynamic size (like text) we estimated the
> string length and used the function select pg_column_size('expected text
> here'::text)

long text is subject to compression, pg_column_size doesn't seem to
test compression, compression is some sort of LZ..

>          - We added 4 more bytes for the OID that PostgreSQL uses internally

OID is optional, IIRC PGXID is not

>      1.2) Multiply the size of each row by the number of estimated rows
>      ** Do I need to consider any overhead here, like row or table
> metadata? **

page size 8K
column overhead 1 byte per not-NULL column, NULLs are free,

> 2) Estimate the size of each table index
>      ** Don't know how to estimate this, need advice here **

IIRC
( data being indexed + 8 bytes ) / fill factor


> 3) Estimate the size of the transaction log
>      ** We've got no idea how to estimate this, need advice **

how big are your transactions?

> 4) Estimate the size of the backups (full and incremental)
>      ** Don't know how to estimate this, need advice here **

depends on the format you use, backups tend to compress well.

> 5) Sum all the estimates for the actual minimum size

no, you get estimated size.

> 6) Apply a factor of 1.5x (the 50% growth) to the sum of the estimates
> 1, 2 and 4 for the minimum size after 1 year
>
> 7) Apply an overall factor of 1.2 ~ 1.4 (20% to 40% more) to estimates 5
> and 6 for a good safety margin
>
> I know the rules got pretty extensive, please let me know if you need
> more data or examples for a better understanding.
>
> We've also posted this question to
> http://dba.stackexchange.com/q/25617/10166
>
> Thanks in advance,
> Daniel Serodio
>
>


--
⚂⚃ 100% natural

pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: Postgres DB Migration from 8.3 to 9.1
Next
From: John R Pierce
Date:
Subject: Re: Help estimating database and WAL size