Thread: How to evaluate disk space needed by a table

How to evaluate disk space needed by a table

From
高健
Date:
Hello:

I  created a table, and found the file created for that table is about 10 times of that I estimated!
The following is what I did:

postgres=# create table tst01(id integer);
CREATE TABLE
postgres=# 

postgres=# select oid from pg_class where relname='tst01';
  oid  
-------
 16384
(1 row)
Then I can see the file now:
[root@lex base]# ls ./12788/16384
./12788/16384

I heard that one integer type  will  use 4 bytes. 
so I think  that  2048 records with only one column of integer data type, 
will use a little more than 8K(2048 records *  4 bytes/per integer data type + headers).

But in fact they use so much more:
After I run this:
postgres=# insert into tst01 values(generate_series(1,2048));
INSERT 0 2048
postgres=# 

I can find  the file 16384  is  now 80KB!
[root@lex base]# ls -lrt ./12788/16384
-rw------- 1 postgres postgres 81920 May 28 11:54 ./12788/16384
[root@lex base]# ls -lrt -kb ./12788/16384
-rw------- 1 postgres postgres 80 May 28 11:54 ./12788/16384
[root@lex base]# 

Then I tried again , I put another 2048 records:
postgres=# insert into tst01 values(generate_series(2049,4096));
INSERT 0 2048
postgres=#
And found that the file is now 152KB!
[root@lex base]# ls -lrt -kb ./12788/16384
-rw------- 1 postgres postgres 152 May 28 11:56 ./12788/16384
[root@lex base]# 

Before this, I have thought that   headers and  other structure will just use a little space.
But  what I found is about 10 times the space I evaluated.

So , Is there any  method to correctly evaluate  disk space one table will need, 
given the table's column data types and , estimated record numbers ?
 

Re: How to evaluate disk space needed by a table

From
Stephen Frost
Date:
* 高健 (luckyjackgao@gmail.com) wrote:
> So , Is there any  method to correctly evaluate  disk space one table will
> need,
> given the table's column data types and , estimated record numbers ?

The simplest might be to do exactly what you did- create the table and
then check the size with a subset of records.  It won't be exactly
linear from there but it'd provide a good estimate.

Otherwise, you need to consider the various additional fields which
PostgreSQL adds to every tuple to keep track of visibility and other
information.  Then you have to add in the page header and the other bits
of the tuple header beyond the system columns.  Lastly, you need to
figure out how many tuples will actually fit on a page based on their
size, because there will be gaps if the tuple doesn't fit exactly into
the remaining space in the page.

btw, there are helper functions to get disk usage-
pg_total_relation_size() and pg_relation_size() come to mind, though
there are also others.

    Thanks,

        Stephen

Attachment

Re: How to evaluate disk space needed by a table

From
Raghavendra
Date:
On Tue, May 28, 2013 at 9:48 AM, 高健 <luckyjackgao@gmail.com> wrote:
Hello:

I  created a table, and found the file created for that table is about 10 times of that I estimated!
The following is what I did:

postgres=# create table tst01(id integer);
CREATE TABLE
postgres=# 

postgres=# select oid from pg_class where relname='tst01';
  oid  
-------
 16384
(1 row)
Then I can see the file now:
[root@lex base]# ls ./12788/16384
./12788/16384

I heard that one integer type  will  use 4 bytes. 
so I think  that  2048 records with only one column of integer data type, 
will use a little more than 8K(2048 records *  4 bytes/per integer data type + headers).


You heard right, as other said there are various hidden fileds added to every tuple like (ctid,xmin,xmax,cmin,cmax). All these occupy some bytes in the page. Take your example.

As per integer column, every column data occupies 4 bytes.

postgres=# select pg_column_size(id) from tst01 limit 1;
 pg_column_size
----------------
              4
(1 row)

When you calculate the row size...

postgres=# select pg_column_size(t) from tst01 t limit 1;
 pg_column_size
----------------
             28
(1 row)

Here 24 bytes as row header and  4 bytes of integer data.

---
Regards,
Raghavendra
EnterpriseDB Corporation