Large databases, performance - Mailing list pgsql-hackers
| From | Shridhar Daithankar |
|---|---|
| Subject | Large databases, performance |
| Date | |
| Msg-id | 3D9C8712.9513.9C6521D@localhost |
| Responses | Re: Large databases, performance<br>Re: [GENERAL] Large databases, performance |
| List | pgsql-hackers |
Hi,
Today we concluded the database performance test. Attached are the results and the
schema, for those who missed the earlier discussion.
We have (almost) decided to partition the data across machines. The scheme is:
after every short interval, a burst of data will be entered into a new table in
the database, then indexed and vacuumed. The tables will be inherited, so that a
query on the base table fetches results from all the children. The application
has to consolidate all the data on a per-node basis. If the database is not
PostgreSQL, the application has to consolidate data across partitions as well.
Now we need to investigate whether selecting on the base table, including its
children, would use the indexes created on the child tables.
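The inheritance scheme described above can be sketched as follows (a minimal illustration only; the table names are hypothetical, though the indexed columns esn, min, and datetime come from the attached schema):

```sql
-- Base table; each load burst goes into a new child table.
-- Table names here are illustrative, not from the post.
CREATE TABLE calls_base (
    esn      char(10)  NOT NULL,
    min      char(10)  NOT NULL,
    datetime timestamp NOT NULL
);

-- One child per burst, inheriting the base schema,
-- indexed after the bulk load completes:
CREATE TABLE calls_burst_001 () INHERITS (calls_base);
CREATE INDEX calls_burst_001_idx
    ON calls_burst_001 (esn, min, datetime);

-- A select on the base table scans all children; the open
-- question is whether the planner uses each child's index:
SELECT * FROM calls_base
WHERE esn = '0000000001' AND min = '0000000002';
```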
It's estimated that when the entire data set is gathered, the total number of
child tables will be around 1K-1.1K across all machines.
This is based on the average rate of data insertion, i.e. 5K records/sec, and
the total data size, estimated at 9 billion rows max, i.e. an estimated database
size of 900GB. Obviously it's impossible to keep the insertion rate on an indexed
table high as the data grows, so partitioning/inheritance looks like the better
approach.
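A quick back-of-the-envelope check of the sizing figures above (a sketch only; the ~100-byte tuple size is taken from the attached schema sheet, and the result excludes index and storage overhead):

```python
rows = 9_000_000_000   # estimated maximum row count from the post
row_bytes = 100        # approximate tuple size per the schema sheet
rate = 5_000           # sustained insertion rate, records/second

# Raw data volume in decimal GB, before index/storage overhead:
raw_size_gb = rows * row_bytes / 1_000_000_000

# Time to accumulate the full data set at a steady 5K records/sec:
days_to_fill = rows / rate / 86_400

print(raw_size_gb)              # 900.0 GB, matching the quoted estimate
print(round(days_to_fill, 1))   # about 20.8 days of sustained load
```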
PostgreSQL is not the final winner yet; MySQL is in close range. I will keep
you guys posted about the results.
Let me know about any comments..
Bye
Shridhar
--
Price's Advice: It's all a game -- play it to have fun.
Machine
- Compaq ProLiant ML530 server
- Intel Xeon 2.4 GHz processor x 4
- 4 GB RAM, 5 x 72.8 GB SCSI HDD
- RAID 0 (striping) hardware setup, Mandrake Linux 9.0
- Cost: $13,500 ($1,350 for each additional 72GB HDD)
Complete Data: 40 GB, 432,000,000 tuples, about 100 bytes each (schema on 'schema' sheet); composite index on 3 fields (esn, min, datetime).

| Performance Parameter | MySQL 3.23.52 without InnoDB (no transactional support) | MySQL 3.23.52 with InnoDB (transactional support) | PostgreSQL 7.2.2 (built-in transaction support) |
|---|---|---|---|
| Inserts + building a composite index | 3738 secs | 18720 secs | 20628 secs |
| Load Speed | 115570 tuples/second | 23076 tuples/second | 20942 tuples/second |
| Database Size on Disk | 48 GB | 87 GB | 111 GB |
Average per partition: 300MB data, 3,000,000 tuples, about 100 bytes each (schema on 'schema' sheet); composite index on 3 fields (esn, min, datetime).

| Performance Parameter | MySQL 3.23.52 without InnoDB (no transactional support) | MySQL 3.23.52 with InnoDB (transactional support) | PostgreSQL 7.2.2 (built-in transaction support) |
|---|---|---|---|
| Inserts + building a composite index | 28 secs | 130 secs | 150 secs |
| Select Query (equality match on 2 fields, esn and min; 4 concurrent queries running) | 7 secs | 7 secs | 6 secs |
| Database Size on Disk | 341 MB | 619 MB | 788 MB |
| Field Name | Field Type | Nullable | Indexed |
|---|---|---|---|
| type | int | no | no |
| esn | char (10) | no | yes |
| min | char (10) | no | yes |
| datetime | timestamp | no | yes |
| opc0 | char (3) | no | no |
| opc1 | char (3) | no | no |
| opc2 | char (3) | no | no |
| dpc0 | char (3) | no | no |
| dpc1 | char (3) | no | no |
| dpc2 | char (3) | no | no |
| npa | char (3) | no | no |
| nxx | char (3) | no | no |
| rest | char (4) | no | no |
| field0 | int | yes | no |
| field1 | char (4) | yes | no |
| field2 | int | yes | no |
| field3 | char (4) | yes | no |
| field4 | int | yes | no |
| field5 | char (4) | yes | no |
| field6 | int | yes | no |
| field7 | char (4) | yes | no |
| field8 | int | yes | no |
| field9 | char (4) | yes | no |