Thread: postgresql storage and performance questions
Hi, I have a few questions about the storage and performance 1. How do you estimate the table size in postgresql? For example if I have a table 'Dummy' with 1 varchar (40) & 1 numeric(22,0) fields and 1000 rows, what is the tablesize estimate for this (including the row overhead etc)? How many pages will this occupy? 2. Also if the table contains null columns, does postgres allocates the same space for these nulls columns? How does it handle 'nulls' in terms of storage? 3. How does oracle handle these 2 cases? 4. Does increasing the block size in postgres improve query performance? Thanks in advance Josh
2007/11/19, Josh Harrison <joshques@gmail.com>: > Hi, > I have a few questions about the storage and performance > > 1. How do you estimate the table size in postgresql? > For example if I have a table 'Dummy' with 1 varchar (40) & 1 > numeric(22,0) fields and 1000 rows, what is the tablesize estimate for > this (including the row overhead etc)? How many pages will this > occupy? > > 2. Also if the table contains null columns, does postgres allocates > the same space for these nulls columns? How does it handle 'nulls' in > terms of storage? Try these: http://www.postgresql.org/docs/current/static/datatype-numeric.html http://www.postgresql.org/docs/current/static/datatype-character.html http://www.postgresql.org/docs/current/static/storage-page-layout.html ad.1 ) There is a fixed-size header (occupying 27 bytes on most machines) for each tuple so you will have about 27 + 1 + varchar data + numeric data per row, plus some overhaed for block headers ad.2 ) there is a null bitmap for each tuple which has nullable fields - so every 8 NULLable columns occupy one byte bitmap. PS. why do you post same thing many times? this is kinda.. spam? -- Filip Rembiałkowski
Thanks Filip. I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ). In table1 both the cols are filled and in table2 the varchar colm is null So when I checked the tablesize for these two tables (using pg_relation_size) table1 - 57344 bytes (no null columns) table2 - 49152 bytes (varchar colm is null) There is not much difference between the two sizes.So even if a column is null postgresql still has lots of overhead. Does postgres occupy space even when the column is NULL? This is not a spam.... I posted it twice becoz my question didnot show up the first time in the mailing list even after 30 minutes. So i tried again and then both showed up...kind of strange though! Thanks again Josh On Nov 19, 2007 1:37 PM, Filip Rembiałkowski <plk.zuber@gmail.com> wrote: > 2007/11/19, Josh Harrison <joshques@gmail.com>: > > Hi, > > I have a few questions about the storage and performance > > > > 1. How do you estimate the table size in postgresql? > > For example if I have a table 'Dummy' with 1 varchar (40) & 1 > > numeric(22,0) fields and 1000 rows, what is the tablesize estimate for > > this (including the row overhead etc)? How many pages will this > > occupy? > > > > 2. Also if the table contains null columns, does postgres allocates > > the same space for these nulls columns? How does it handle 'nulls' in > > terms of storage? > > Try these: > http://www.postgresql.org/docs/current/static/datatype-numeric.html > http://www.postgresql.org/docs/current/static/datatype-character.html > http://www.postgresql.org/docs/current/static/storage-page-layout.html > > ad.1 ) There is a fixed-size header (occupying 27 bytes on most > machines) for each tuple > > so you will have about 27 + 1 + varchar data + numeric data per row, > plus some overhaed for block headers > > ad.2 ) there is a null bitmap for each tuple which has nullable fields > - so every 8 NULLable columns occupy one byte bitmap. > > > PS. why do you post same thing many times? this is kinda.. spam? > > -- > Filip Rembiałkowski >
On 11/19/07, Josh Harrison <joshques@gmail.com> wrote: > I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ). > In table1 both the cols are filled and in table2 the varchar colm is null > > So when I checked the tablesize for these two tables (using pg_relation_size) > table1 - 57344 bytes (no null columns) > table2 - 49152 bytes (varchar colm is null) > > There is not much difference between the two sizes.So even if a column > is null postgresql still has lots of overhead. > Does postgres occupy space even when the column is NULL? PostgreSQL's disk storage works in "pages", where each page is 8KB. It will use as much space within each page as it can. Filip's last link details this. Is there a specific reason you're looking at this, as in you have some requirement to meet? Or just curious how it works?
We are working on migrating our database from oracle to postgres. Postgres tablesize is twice than oracle tablesize for all my tables.And so the query also takes twice as much time than oracle. So we were checking to see what makes postgres slower than oracle even for basic full tablescan queries. There were a couple of things we noted. 1. Tablesize twice as much than oracle-- Im not sure if postgres null columns has any overhead since we have lots of null columns in our tables.Does postgresql has lots of overhead for null columns? 2. Oracle seems to be reading larger bocks than postgresql (when we examined the iostat and vmstat) (we had set postgres' db block size as 8 and oracle's is 16kb...) Do you have any comments on this? Thanks in advance josh On Nov 20, 2007 12:37 AM, Trevor Talbot <quension@gmail.com> wrote: > On 11/19/07, Josh Harrison <joshques@gmail.com> wrote: > > > I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ). > > In table1 both the cols are filled and in table2 the varchar colm is null > > > > So when I checked the tablesize for these two tables (using pg_relation_size) > > table1 - 57344 bytes (no null columns) > > table2 - 49152 bytes (varchar colm is null) > > > > There is not much difference between the two sizes.So even if a column > > is null postgresql still has lots of overhead. > > Does postgres occupy space even when the column is NULL? > > PostgreSQL's disk storage works in "pages", where each page is 8KB. > It will use as much space within each page as it can. Filip's last > link details this. > > Is there a specific reason you're looking at this, as in you have some > requirement to meet? Or just curious how it works? > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
Josh Harrison escribió: > > On 11/19/07, Josh Harrison <joshques@gmail.com> wrote: > > > > > I have 2 tables with 2 cols each( 1 numeric(8,0) and 1 varchar(3) ). > > > In table1 both the cols are filled and in table2 the varchar colm is null > There were a couple of things we noted. > 1. Tablesize twice as much than oracle-- Im not sure if postgres null > columns has any overhead since we have lots of null columns in our > tables.Does postgresql has lots of overhead for null columns? No, NULLs are stored as a bitmap for each tuple and they are quite efficient. Probably the reason for the difference is the numeric field which Oracle may be optimizing as a plain integer. Did you try declaring the column as INTEGER in Postgres? Please do not top-post. -- Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC "Everything that I think about is more fascinating than the crap in your head." (Dogbert's interpretation of blogger philosophy)
2007/11/20, Josh Harrison <joshques@gmail.com>: > We are working on migrating our database from oracle to postgres. > Postgres tablesize is twice than oracle tablesize for all my > tables. Interesting. Which postgresql version? >And so the query also takes twice as much time than oracle. This is even more interesting :) What query? can you show it here along with EXPLAIN ANALYZE? Did you do some index tuning or do you just expect the indexes ported from Oracle schema to work? Did you run ANALYZE after populating database? What are server parameters and did you tune postgres config to fit them? > So > we were checking to see what makes postgres slower than oracle even > for basic full tablescan queries. I'm curious too :) please let me know if you resolve this mystery :) > There were a couple of things we noted. > 1. Tablesize twice as much than oracle-- Im not sure if postgres null > columns has any overhead since we have lots of null columns in our > tables.Does postgresql has lots of overhead for null columns? I've expained this previously - you have a bitmap in each tuple. Bitmap size is (NATTS+7) % 8 > 2. Oracle seems to be reading larger bocks than postgresql (when we > examined the iostat and vmstat) (we had set postgres' db block size as > 8 and oracle's is 16kb...) yes, 8 kB is default pg block size. it is not recommended to change it - however it could be useful in some situations - but I doubt it would speedup your queries twice, whatever they are. -- Filip Rembiałkowski
On Nov 20, 2007 8:10 AM, Filip Rembiałkowski <plk.zuber@gmail.com> wrote: > 2007/11/20, Josh Harrison <joshques@gmail.com>: > > We are working on migrating our database from oracle to postgres. > > Postgres tablesize is twice than oracle tablesize for all my > > tables. > Interesting. Which postgresql version? Version 8.2.3 > >And so the query also takes twice as much time than oracle. > This is even more interesting :) What query? can you show it here > along with EXPLAIN ANALYZE? explain analyze select count(*) from dummy1 QUERY PLAN ------------------------------- Aggregate (cost=1192999.60..1192999.61 rows=1 width=0) (actual time=109792.239..109792.239 rows=1 loops=1) -> Seq Scan on dummy1 (cost=0.00..1119539.48 rows=29384048 30000000 width=0) (actual time=0.027..101428.016 rows=29384048 loops=1) Total runtime: 109792.332 ms Postgresql takes 1m 40s for this query Oracle takes 45 sec It is just a count(*) query. I know count(*) query is slower in postgres becoz it doesn't use index. But in Oracle I give the query as select /*+full(dummy1)*/ count(*) from dummy1 with the hint so that oracle uses full table scan and not the index scan. > > Did you do some index tuning or do you just expect the indexes ported > from Oracle schema to work? I created the indexes and Im not sure what kind of tuning neds to be done for the indexes. But this above query doesnt use any indexes. > Did you run ANALYZE after populating database? Yes > What are server parameters and did you tune postgres config to fit them? I had attached my config file and the table structure > > So > > we were checking to see what makes postgres slower than oracle even > > for basic full tablescan queries. > I'm curious too :) please let me know if you resolve this mystery :) > > > > There were a couple of things we noted. > > 1. Tablesize twice as much than oracle-- Im not sure if postgres null > > columns has any overhead since we have lots of null columns in our > > tables.Does postgresql has lots of overhead for null columns? > I've expained this previously - you have a bitmap in each tuple. > Bitmap size is (NATTS+7) % 8 > > > 2. Oracle seems to be reading larger bocks than postgresql (when we > > examined the iostat and vmstat) (we had set postgres' db block size as > > 8 and oracle's is 16kb...) > yes, 8 kB is default pg block size. it is not recommended to change it > - however it could be useful in some situations - but I doubt it would > speedup your queries twice, whatever they are. > > > > -- > Filip Rembiałkowski > Thanks again josh
Attachment
On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote: > There were a couple of things we noted. > 1. Tablesize twice as much than oracle-- Im not sure if postgres null > columns has any overhead since we have lots of null columns in our > tables.Does postgresql has lots of overhead for null columns? Did you by any chance have an aborted load of the data? If you load in a table, and that load fails or does not commit, it will still occupy the space until you vacuum. If you try to load again, the table will be twice the size. If you want to compact the physical space the table occupies, you can try running VACUUM FULL on it, and possibly a redindex afterwards. This will bring the physical space down to the minimum. Both of these operations will lock out access to the tables though. > 2. Oracle seems to be reading larger bocks than postgresql (when we > examined the iostat and vmstat) (we had set postgres' db block size as > 8 and oracle's is 16kb...) > Do you have any comments on this? 8k is the defualt. You can change the block size if you need to. You need to modify src/include/pg_config_manual.h recompile and re-initdb. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
On 11/20/07, Josh Harrison <joshques@gmail.com> wrote: > We are working on migrating our database from oracle to postgres. > Postgres tablesize is twice than oracle tablesize for all my > tables.And so the query also takes twice as much time than oracle. So > we were checking to see what makes postgres slower than oracle even > for basic full tablescan queries. A similar question came up recently: http://archives.postgresql.org/pgsql-general/2007-11/msg00619.php You won't see anything that dramatic, but you might try to see how 8.3beta does with your test data.
On Nov 20, 2007 11:13 AM, Brad Nicholson <bnichols@ca.afilias.info> wrote: > On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote: > > > There were a couple of things we noted. > > 1. Tablesize twice as much than oracle-- Im not sure if postgres null > > columns has any overhead since we have lots of null columns in our > > tables.Does postgresql has lots of overhead for null columns? > > Did you by any chance have an aborted load of the data? If you load in > a table, and that load fails or does not commit, it will still occupy > the space until you vacuum. If you try to load again, the table will be > twice the size. > > If you want to compact the physical space the table occupies, you can > try running VACUUM FULL on it, and possibly a redindex afterwards. This > will bring the physical space down to the minimum. Both of these > operations will lock out access to the tables though. I ran vacuum full on this table already. I haven't re-indexed it. But this will not affect the table size...right...since indexes are stored separately? > > 2. Oracle seems to be reading larger bocks than postgresql (when we > > examined the iostat and vmstat) (we had set postgres' db block size as > > 8 and oracle's is 16kb...) > > Do you have any comments on this? > > 8k is the defualt. You can change the block size if you need to. You > need to modify src/include/pg_config_manual.h recompile and re-initdb. Does changing the block size has any side effects on any other operations in particular? > Brad Nicholson 416-673-4106 > Database Administrator, Afilias Canada Corp. > > >
On Tue, 2007-11-20 at 13:04 -0500, Josh Harrison wrote: > On Nov 20, 2007 11:13 AM, Brad Nicholson <bnichols@ca.afilias.info> wrote: > > On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote: > > > > > There were a couple of things we noted. > > > 1. Tablesize twice as much than oracle-- Im not sure if postgres null > > > columns has any overhead since we have lots of null columns in our > > > tables.Does postgresql has lots of overhead for null columns? > > > > Did you by any chance have an aborted load of the data? If you load in > > a table, and that load fails or does not commit, it will still occupy > > the space until you vacuum. If you try to load again, the table will be > > twice the size. > > > > If you want to compact the physical space the table occupies, you can > > try running VACUUM FULL on it, and possibly a redindex afterwards. This > > will bring the physical space down to the minimum. Both of these > > operations will lock out access to the tables though. > I ran vacuum full on this table already. I haven't re-indexed it. But > this will not affect the table size...right...since indexes are stored > separately? You are correct about the indexes. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
On Nov 20, 2007, at 1:04 PM, Josh Harrison wrote: > I ran vacuum full on this table already. I haven't re-indexed it. But > this will not affect the table size...right...since indexes are stored > separately? >> Yes, but your indexes are probably bloated at this point, so to reduce the space they use run a reindex.