Large databases, performance - Mailing list pgsql-hackers
From | Shridhar Daithankar
Subject | Large databases, performance
Date |
Msg-id | 3D9C8712.9513.9C6521D@localhost
Responses | Re: Large databases, performance
 | Re: [GENERAL] Large databases, performance
List | pgsql-hackers
Hi,

Today we concluded a test of database performance. Attached are the results and the schema, for those who missed the earlier discussion on this.

We have (almost) decided that we will partition the data across machines. The idea is that after every short interval a burst of data will be entered into a new table in the database, then indexed and vacuumed. The tables will be inherited, so that a query on the base table will fetch results from all the children. The application has to consolidate all the data on a per-node basis. If the database is not PostgreSQL, the application has to consolidate data across partitions as well.

Now we need to investigate whether selecting on the base table so as to include the children would use the indexes created on the children tables.

It is estimated that when the entire data is gathered, the total number of children tables would be around 1K-1.1K across all machines. This follows from the average rate of data insertion, i.e. 5K records/sec, and the total data size, estimated to be 9 billion rows max, i.e. an estimated database size of 900 GB. Obviously it's impossible to keep the insertion rate on an indexed table high as the data grows, so partitioning/inheritance looks like the better approach.

PostgreSQL is not the final winner as yet; MySQL is in close range. I will keep you posted about the results. Let me know of any comments.

Bye
Shridhar

--
Price's Advice: It's all a game -- play it to have fun.

Machine: Compaq Proliant Server ML 530
Intel Xeon 2.4 GHz processor x 4
4 GB RAM, 5 x 72.8 GB SCSI HDD
RAID 0 (striping) hardware setup, Mandrake Linux 9.0
Cost - $13,500 ($1,350 for each additional 72 GB HDD)

Performance Parameter | MySQL 3.23.52 without InnoDB (no transactional support) | MySQL 3.23.52 with InnoDB for transactional support | PostgreSQL 7.2.2 with built-in support for transactions
Complete data: inserts + building a composite index (40 GB data, 432,000,000 tuples, about 100 bytes each, schema given below, composite index on 3 fields: esn, min, datetime) | 3738 secs | 18720 secs | 20628 secs
Load speed | 115570 tuples/second | 23076 tuples/second | 20942 tuples/second
Database size on disk | 48 GB | 87 GB | 111 GB
Average per partition: inserts + building a composite index (300 MB data, 3,000,000 tuples, about 100 bytes each, composite index on 3 fields: esn, min, datetime) | 28 secs | 130 secs | 150 secs
Select query (equality match on 2 fields, esn and min, 4 concurrent queries running) | 7 secs | 7 secs | 6 secs
Database size on disk (per partition) | 341 MB | 619 MB | 788 MB

Schema:

Field Name | Field Type | Nullable | Indexed
type | int | no | no
esn | char(10) | no | yes
min | char(10) | no | yes
datetime | timestamp | no | yes
opc0 | char(3) | no | no
opc1 | char(3) | no | no
opc2 | char(3) | no | no
dpc0 | char(3) | no | no
dpc1 | char(3) | no | no
dpc2 | char(3) | no | no
npa | char(3) | no | no
nxx | char(3) | no | no
rest | char(4) | no | no
field0 | int | yes | no
field1 | char(4) | yes | no
field2 | int | yes | no
field3 | char(4) | yes | no
field4 | int | yes | no
field5 | char(4) | yes | no
field6 | int | yes | no
field7 | char(4) | yes | no
field8 | int | yes | no
field9 | char(4) | yes | no
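As a rough illustration of the inheritance scheme described above, here is a minimal sketch in PostgreSQL using the attached schema. The table and index names (calls_base, calls_burst_1, and the index) are made up for this example, and the EXPLAIN at the end is simply one way to check whether a select on the base table ends up using the children's indexes.

-- Base table holding no data of its own; illustrative names only.
CREATE TABLE calls_base (
    type     int        NOT NULL,
    esn      char(10)   NOT NULL,
    min      char(10)   NOT NULL,
    datetime timestamp  NOT NULL,
    opc0     char(3)    NOT NULL,
    opc1     char(3)    NOT NULL,
    opc2     char(3)    NOT NULL,
    dpc0     char(3)    NOT NULL,
    dpc1     char(3)    NOT NULL,
    dpc2     char(3)    NOT NULL,
    npa      char(3)    NOT NULL,
    nxx      char(3)    NOT NULL,
    rest     char(4)    NOT NULL,
    field0   int,
    field1   char(4),
    field2   int,
    field3   char(4),
    field4   int,
    field5   char(4),
    field6   int,
    field7   char(4),
    field8   int,
    field9   char(4)
);

-- Each burst of data goes into its own child table, which inherits all
-- columns from the base table.
CREATE TABLE calls_burst_1 () INHERITS (calls_base);

-- After the bulk load, build the composite index on the child and vacuum it.
CREATE INDEX calls_burst_1_esn_min_dt ON calls_burst_1 (esn, min, datetime);
VACUUM ANALYZE calls_burst_1;

-- A select on the base table automatically includes all children; EXPLAIN
-- shows whether the planner uses each child's index for the equality match
-- on esn and min.
EXPLAIN SELECT * FROM calls_base WHERE esn = '1234567890' AND min = '0987654321';

Under this layout each new burst gets its own child table, its own composite index built after the load, and its own vacuum, so the insert rate does not degrade as the total data volume grows.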