We are designing a fairly large application that requires a high-performance database backend. The rates we need to reach are at least 5000 inserts per second and 15 selects per second per connection, with only 3 or 4 simultaneous connections. I think our main concern is handling the constant flow of incoming inserts, which must become available for selection as quickly as possible (near real-time access...).
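To give an idea of the insert side, here is a minimal sketch of how we would batch the incoming rows; the flow table and its columns are placeholders, not our real schema. Batching many rows per statement, or streaming them with COPY, is how we hope to sustain ~5000 inserts per second on one connection.

-- Hypothetical flow table; names are invented for illustration.
CREATE TABLE flow (
    recorded_at timestamptz NOT NULL,
    sensor_id   integer     NOT NULL,
    value       double precision
);

-- Many rows per INSERT statement instead of one round trip per row.
INSERT INTO flow (recorded_at, sensor_id, value) VALUES
    (now(), 1, 17.4),
    (now(), 2, 18.1),
    (now(), 3, 16.9);

-- COPY is faster still for bulk loading; the rows arrive on stdin.
COPY flow (recorded_at, sensor_id, value) FROM STDIN WITH (FORMAT csv);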
As a consequence, the database will quickly grow to more than a hundred gigabytes. We still have to determine how and when we should back up old data so that the application does not suffer a performance drop. We intend to develop some kind of real-time partitioning on our main table to keep the flow up; a rough sketch of what we have in mind follows.
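This is only a sketch, assuming PostgreSQL's declarative range partitioning (available from version 10); it reuses the hypothetical flow table from above, and the partition names and dates are made up.

-- The same hypothetical flow table, this time declared partitioned.
CREATE TABLE flow (
    recorded_at timestamptz NOT NULL,
    sensor_id   integer     NOT NULL,
    value       double precision
) PARTITION BY RANGE (recorded_at);

-- One partition per month, created ahead of the incoming data.
CREATE TABLE flow_2024_01 PARTITION OF flow
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Old partitions can be detached, backed up, and dropped without
-- touching the live table, which is how we hope to avoid the
-- performance drop as the data grows.
ALTER TABLE flow DETACH PARTITION flow_2024_01;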
At first we were planning to use SQL Server, as it has features that in my opinion could help us a lot:
- replication
- clustering
Recently we started to study PostgreSQL as a solution for our project:
- it also has replication
- the PostGIS module can handle geographic datatypes, which would simplify our development (a small example follows this list)
- we have strong PostgreSQL administration skills (we already use it for production processes)
- it is free (!), so we could put the savings toward hardware
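As an illustration of what PostGIS would give us out of the box, here is a sketch of a proximity query; it assumes the postgis extension is installed, and the stations table and coordinates are invented for the example.

-- Enable PostGIS (must be installed on the server).
CREATE EXTENSION IF NOT EXISTS postgis;

-- Hypothetical reference table with a geographic point column.
CREATE TABLE stations (
    id       serial PRIMARY KEY,
    name     text,
    location geography(Point, 4326)
);

-- Spatial index so proximity searches stay fast as the table grows.
CREATE INDEX stations_location_idx ON stations USING gist (location);

-- All stations within 10 km of a given point (distance in metres).
SELECT name
FROM stations
WHERE ST_DWithin(
    location,
    ST_SetSRID(ST_MakePoint(2.3522, 48.8566), 4326)::geography,
    10000
);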
Is SQL Server clustering a real asset? How reliable are the PostgreSQL replication tools? Can I trust PostgreSQL's performance for this kind of workload?
My question is a bit fuzzy, but any advice is most welcome... hardware, tuning, or design tips as well :))