Thread: postgresql storage and performance questions

postgresql storage and performance questions

From
"Josh Harrison"
Date:
Hi,
I have a few questions about the storage and performance

1. How do you estimate the table size in postgresql?
For example if I have a table 'Dummy' with 1 varchar (40) & 1
numeric(22,0) fields and 1000 rows, what is the tablesize estimate for
this (including the row overhead etc)? How many pages will this
occupy?

2. Also if the table contains null columns, does postgres allocates
the same space for these nulls columns? How does it handle 'nulls' in
terms of storage?

3. How does oracle handle these 2 cases?

4. Does increasing the block size in postgres improve query performance?

Thanks in advance
Josh

Re: postgresql storage and performance questions

From
"Filip Rembiałkowski"
Date:
2007/11/19, Josh Harrison <joshques@gmail.com>:
> Hi,
> I have a few questions about the storage and performance
>
> 1. How do you estimate the table size in postgresql?
> For example if I have a table 'Dummy' with 1 varchar (40) & 1
> numeric(22,0) fields and 1000 rows, what is the tablesize estimate for
> this (including the row overhead etc)? How many pages will this
> occupy?
>
> 2. Also if the table contains null columns, does postgres allocates
> the same space for these nulls columns? How does it handle 'nulls' in
> terms of storage?

Try these:
http://www.postgresql.org/docs/current/static/datatype-numeric.html
http://www.postgresql.org/docs/current/static/datatype-character.html
http://www.postgresql.org/docs/current/static/storage-page-layout.html

ad.1 )  There is a fixed-size header (occupying 27 bytes on most
machines) for each tuple

so you will have about 27 + 1 +  varchar data + numeric data per row,
plus some overhaed for block headers

ad.2 ) there is a null bitmap for each tuple which has nullable fields
- so every 8 NULLable columns occupy one byte bitmap.


PS. why do you post same thing many times? this is kinda.. spam?

--
Filip Rembiałkowski

Re: postgresql storage and performance questions

From
"Josh Harrison"
Date:
Thanks Filip.
I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ).
In table1 both the cols are filled and in table2  the varchar colm is null

So when I checked the tablesize for these two tables (using pg_relation_size)
table1 - 57344 bytes (no null columns)
table2 - 49152 bytes (varchar colm is null)

There is not much difference between the two sizes.So even if a column
is null postgresql still has lots of overhead.
 Does postgres occupy space even when the column is NULL?

This is not a spam.... I posted it twice becoz my question  didnot
show up the first time in the mailing list even after 30 minutes. So i
tried again and then both showed up...kind of strange though!

Thanks again
Josh

On Nov 19, 2007 1:37 PM, Filip Rembiałkowski <plk.zuber@gmail.com> wrote:
> 2007/11/19, Josh Harrison <joshques@gmail.com>:
> > Hi,
> > I have a few questions about the storage and performance
> >
> > 1. How do you estimate the table size in postgresql?
> > For example if I have a table 'Dummy' with 1 varchar (40) & 1
> > numeric(22,0) fields and 1000 rows, what is the tablesize estimate for
> > this (including the row overhead etc)? How many pages will this
> > occupy?
> >
> > 2. Also if the table contains null columns, does postgres allocates
> > the same space for these nulls columns? How does it handle 'nulls' in
> > terms of storage?
>
> Try these:
> http://www.postgresql.org/docs/current/static/datatype-numeric.html
> http://www.postgresql.org/docs/current/static/datatype-character.html
> http://www.postgresql.org/docs/current/static/storage-page-layout.html
>
> ad.1 )  There is a fixed-size header (occupying 27 bytes on most
> machines) for each tuple
>
> so you will have about 27 + 1 +  varchar data + numeric data per row,
> plus some overhaed for block headers
>
> ad.2 ) there is a null bitmap for each tuple which has nullable fields
> - so every 8 NULLable columns occupy one byte bitmap.
>
>
> PS. why do you post same thing many times? this is kinda.. spam?
>
> --
> Filip Rembiałkowski
>

Re: postgresql storage and performance questions

From
"Trevor Talbot"
Date:
On 11/19/07, Josh Harrison <joshques@gmail.com> wrote:

> I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ).
> In table1 both the cols are filled and in table2  the varchar colm is null
>
> So when I checked the tablesize for these two tables (using pg_relation_size)
> table1 - 57344 bytes (no null columns)
> table2 - 49152 bytes (varchar colm is null)
>
> There is not much difference between the two sizes.So even if a column
> is null postgresql still has lots of overhead.
>  Does postgres occupy space even when the column is NULL?

PostgreSQL's disk storage works in "pages", where each page is 8KB.
It will use as much space within each page as it can.  Filip's last
link details this.

Is there a specific reason you're looking at this, as in you have some
requirement to meet?  Or just curious how it works?

Re: postgresql storage and performance questions

From
"Josh Harrison"
Date:
We are working on migrating our database from oracle to postgres.
Postgres tablesize is twice than oracle tablesize for all my
tables.And so the query also takes twice as much time than oracle. So
we were checking to see what makes postgres slower than oracle even
for basic full tablescan queries.
There were a couple of things we noted.
1. Tablesize twice as much than oracle-- Im not sure if postgres null
columns has any overhead since  we have lots of null columns in our
tables.Does postgresql has lots of overhead for null columns?
2. Oracle seems to be reading larger bocks than postgresql (when we
examined the iostat and vmstat) (we had set postgres' db block size as
8 and oracle's is 16kb...)
Do you have any comments on this?

Thanks in advance
josh

On Nov 20, 2007 12:37 AM, Trevor Talbot <quension@gmail.com> wrote:
> On 11/19/07, Josh Harrison <joshques@gmail.com> wrote:
>
> > I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ).
> > In table1 both the cols are filled and in table2  the varchar colm is null
> >
> > So when I checked the tablesize for these two tables (using pg_relation_size)
> > table1 - 57344 bytes (no null columns)
> > table2 - 49152 bytes (varchar colm is null)
> >
> > There is not much difference between the two sizes.So even if a column
> > is null postgresql still has lots of overhead.
> >  Does postgres occupy space even when the column is NULL?
>
> PostgreSQL's disk storage works in "pages", where each page is 8KB.
> It will use as much space within each page as it can.  Filip's last
> link details this.
>
> Is there a specific reason you're looking at this, as in you have some
> requirement to meet?  Or just curious how it works?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>

Re: postgresql storage and performance questions

From
Alvaro Herrera
Date:
Josh Harrison escribió:

> > On 11/19/07, Josh Harrison <joshques@gmail.com> wrote:
> >
> > > I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ).
> > > In table1 both the cols are filled and in table2  the varchar colm is null

> There were a couple of things we noted.
> 1. Tablesize twice as much than oracle-- Im not sure if postgres null
> columns has any overhead since  we have lots of null columns in our
> tables.Does postgresql has lots of overhead for null columns?

No, NULLs are stored as a bitmap for each tuple and they are quite
efficient.

Probably the reason for the difference is the numeric field which Oracle
may be optimizing as a plain integer.   Did you try declaring the column
as INTEGER in Postgres?


Please do not top-post.

--
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"Everything that I think about is more fascinating than the crap in your head."
                               (Dogbert's interpretation of blogger philosophy)

Re: postgresql storage and performance questions

From
"Filip Rembiałkowski"
Date:
2007/11/20, Josh Harrison <joshques@gmail.com>:
> We are working on migrating our database from oracle to postgres.
> Postgres tablesize is twice than oracle tablesize for all my
> tables.
Interesting. Which postgresql version?

>And so the query also takes twice as much time than oracle.
This is even more interesting :) What query? can you show it here
along with EXPLAIN ANALYZE?

Did you do some index tuning or do you just expect the indexes ported
from Oracle schema to work?

Did you run ANALYZE after populating database?

What are server parameters and did you tune postgres config to fit them?

> So
> we were checking to see what makes postgres slower than oracle even
> for basic full tablescan queries.
I'm curious too :) please let me know if you resolve this mystery :)


> There were a couple of things we noted.
> 1. Tablesize twice as much than oracle-- Im not sure if postgres null
> columns has any overhead since  we have lots of null columns in our
> tables.Does postgresql has lots of overhead for null columns?
I've expained this previously - you have a bitmap in each tuple.
Bitmap size is (NATTS+7) % 8

> 2. Oracle seems to be reading larger bocks than postgresql (when we
> examined the iostat and vmstat) (we had set postgres' db block size as
> 8 and oracle's is 16kb...)
yes, 8 kB is default pg block size. it is not recommended to change it
- however it could be useful in some situations - but I doubt it would
speedup your queries twice, whatever they are.



--
Filip Rembiałkowski

Re: postgresql storage and performance questions

From
"Josh Harrison"
Date:
On Nov 20, 2007 8:10 AM, Filip Rembiałkowski <plk.zuber@gmail.com> wrote:
> 2007/11/20, Josh Harrison <joshques@gmail.com>:
> > We are working on migrating our database from oracle to postgres.
> > Postgres tablesize is twice than oracle tablesize for all my
> > tables.
> Interesting. Which postgresql version?
Version 8.2.3

> >And so the query also takes twice as much time than oracle.
> This is even more interesting :) What query? can you show it here
> along with EXPLAIN ANALYZE?

explain analyze select count(*) from dummy1

QUERY PLAN
-------------------------------
 Aggregate  (cost=1192999.60..1192999.61 rows=1 width=0) (actual
time=109792.239..109792.239 rows=1 loops=1)
   ->  Seq Scan on dummy1 (cost=0.00..1119539.48 rows=29384048
30000000 width=0) (actual time=0.027..101428.016 rows=29384048
loops=1)
 Total runtime: 109792.332 ms

Postgresql takes 1m 40s for this query
Oracle takes 45 sec

It is just a count(*) query. I know count(*) query is slower in
postgres becoz it doesn't use index. But in Oracle I give the query as
select /*+full(dummy1)*/ count(*) from dummy1
with the hint so that oracle uses full table scan and not the index scan.

>
> Did you do some index tuning or do you just expect the indexes ported
> from Oracle schema to work?
I created the indexes and Im not sure what kind of tuning neds to be
done for the indexes. But this above query doesnt use any indexes.

> Did you run ANALYZE after populating database?
Yes

> What are server parameters and did you tune postgres config to fit them?
I had attached my config file and the table structure

> > So
> > we were checking to see what makes postgres slower than oracle even
> > for basic full tablescan queries.
> I'm curious too :) please let me know if you resolve this mystery :)
>
>
> > There were a couple of things we noted.
> > 1. Tablesize twice as much than oracle-- Im not sure if postgres null
> > columns has any overhead since  we have lots of null columns in our
> > tables.Does postgresql has lots of overhead for null columns?
> I've expained this previously - you have a bitmap in each tuple.
> Bitmap size is (NATTS+7) % 8
>
> > 2. Oracle seems to be reading larger bocks than postgresql (when we
> > examined the iostat and vmstat) (we had set postgres' db block size as
> > 8 and oracle's is 16kb...)
> yes, 8 kB is default pg block size. it is not recommended to change it
> - however it could be useful in some situations - but I doubt it would
> speedup your queries twice, whatever they are.
>
>
>
> --
> Filip Rembiałkowski
>

Thanks again
josh

Attachment

Re: postgresql storage and performance questions

From
Brad Nicholson
Date:
On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote:

> There were a couple of things we noted.
> 1. Tablesize twice as much than oracle-- Im not sure if postgres null
> columns has any overhead since  we have lots of null columns in our
> tables.Does postgresql has lots of overhead for null columns?

Did you by any chance have an aborted load of the data?  If you load in
a table, and that load fails or does not commit, it will still occupy
the space until you vacuum.  If you try to load again, the table will be
twice the size.

If you want to compact the physical space the table occupies, you can
try running VACUUM FULL on it, and possibly a redindex afterwards.  This
will bring the physical space down to the minimum.  Both of these
operations will lock out access to the tables though.

> 2. Oracle seems to be reading larger bocks than postgresql (when we
> examined the iostat and vmstat) (we had set postgres' db block size as
> 8 and oracle's is 16kb...)
> Do you have any comments on this?

8k is the defualt.  You can change the block size if you need to.  You
need to modify src/include/pg_config_manual.h recompile and re-initdb.
--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: postgresql storage and performance questions

From
"Trevor Talbot"
Date:
On 11/20/07, Josh Harrison <joshques@gmail.com> wrote:

> We are working on migrating our database from oracle to postgres.
> Postgres tablesize is twice than oracle tablesize for all my
> tables.And so the query also takes twice as much time than oracle. So
> we were checking to see what makes postgres slower than oracle even
> for basic full tablescan queries.

A similar question came up recently:
http://archives.postgresql.org/pgsql-general/2007-11/msg00619.php

You won't see anything that dramatic, but you might try to see how
8.3beta does with your test data.

Re: postgresql storage and performance questions

From
"Josh Harrison"
Date:
On Nov 20, 2007 11:13 AM, Brad Nicholson <bnichols@ca.afilias.info> wrote:
> On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote:
>
> > There were a couple of things we noted.
> > 1. Tablesize twice as much than oracle-- Im not sure if postgres null
> > columns has any overhead since  we have lots of null columns in our
> > tables.Does postgresql has lots of overhead for null columns?
>
> Did you by any chance have an aborted load of the data?  If you load in
> a table, and that load fails or does not commit, it will still occupy
> the space until you vacuum.  If you try to load again, the table will be
> twice the size.
>
> If you want to compact the physical space the table occupies, you can
> try running VACUUM FULL on it, and possibly a redindex afterwards.  This
> will bring the physical space down to the minimum.  Both of these
> operations will lock out access to the tables though.
I ran vacuum full on this table already. I haven't re-indexed it. But
this will not affect the table size...right...since indexes are stored
separately?
> > 2. Oracle seems to be reading larger bocks than postgresql (when we
> > examined the iostat and vmstat) (we had set postgres' db block size as
> > 8 and oracle's is 16kb...)
> > Do you have any comments on this?
>
> 8k is the defualt.  You can change the block size if you need to.  You
> need to modify src/include/pg_config_manual.h recompile and re-initdb.
Does changing the block size has any side effects on any other
operations in particular?

> Brad Nicholson  416-673-4106
> Database Administrator, Afilias Canada Corp.
>
>
>

Re: postgresql storage and performance questions

From
Brad Nicholson
Date:
On Tue, 2007-11-20 at 13:04 -0500, Josh Harrison wrote:
> On Nov 20, 2007 11:13 AM, Brad Nicholson <bnichols@ca.afilias.info> wrote:
> > On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote:
> >
> > > There were a couple of things we noted.
> > > 1. Tablesize twice as much than oracle-- Im not sure if postgres null
> > > columns has any overhead since  we have lots of null columns in our
> > > tables.Does postgresql has lots of overhead for null columns?
> >
> > Did you by any chance have an aborted load of the data?  If you load in
> > a table, and that load fails or does not commit, it will still occupy
> > the space until you vacuum.  If you try to load again, the table will be
> > twice the size.
> >
> > If you want to compact the physical space the table occupies, you can
> > try running VACUUM FULL on it, and possibly a redindex afterwards.  This
> > will bring the physical space down to the minimum.  Both of these
> > operations will lock out access to the tables though.
> I ran vacuum full on this table already. I haven't re-indexed it. But
> this will not affect the table size...right...since indexes are stored
> separately?

You are correct about the indexes.


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: postgresql storage and performance questions

From
Vivek Khera
Date:
On Nov 20, 2007, at 1:04 PM, Josh Harrison wrote:

> I ran vacuum full on this table already. I haven't re-indexed it. But
> this will not affect the table size...right...since indexes are stored
> separately?
>>

Yes, but your indexes are probably bloated at this point, so to reduce
the space they use run a reindex.