Thread: I need your help to get opinions about this situation
Hello to everybody, I have this situation that I would to know your opinions about it, to confirm some elements that secure me of use the amazing database system PostgreSQL. I have a database of 1000 tables, 300 of theirs are of major growing with 10000 rows daily, the estimate growing for this database is of 2,6 TB every year. There are accessing 5000 clients to this database of which will be accessed 500 concurrent clients at the same time. There are the questions: 1. Is capable PostgreSQL to support this workload? Some examples better than this. 2. It is a recommendation to use a cluster with load balancer and replication for this situation? Which tools are recommended for this purpose? 3. Which are the hardware recommendations to deploy on servers? CPU, RAM memory capacity, Hard disk capacity and type of RAID system recommended to use among others like Operating System and network connection speed. Greetings and thanks a lot.
I'll leave the "can/cannot" responses to those more familiar with high-load/use situations but I am curious; what reasons other than cost are leading you to discontinue using your current database engine? With that many entities any migration is likely to be quite challenging even if you restricted initial development to standard SQL. My estimate is that it is possible that PostgreSQL would meet your needs - though as you say the use of various tools to connection pool and such are going to be critical. As for recommendations - buy as much hardware as you can afford. David J. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Rayner Julio Rodríguez Pimentel Sent: Wednesday, March 02, 2011 10:41 AM To: pgsql-general@postgresql.org Subject: [GENERAL] I need your help to get opinions about this situation Hello to everybody, I have this situation that I would to know your opinions about it, to confirm some elements that secure me of use the amazing database system PostgreSQL. I have a database of 1000 tables, 300 of theirs are of major growing with 10000 rows daily, the estimate growing for this database is of 2,6 TB every year. There are accessing 5000 clients to this database of which will be accessed 500 concurrent clients at the same time. There are the questions: 1. Is capable PostgreSQL to support this workload? Some examples better than this. 2. It is a recommendation to use a cluster with load balancer and replication for this situation? Which tools are recommended for this purpose? 3. Which are the hardware recommendations to deploy on servers? CPU, RAM memory capacity, Hard disk capacity and type of RAID system recommended to use among others like Operating System and network connection speed. Greetings and thanks a lot. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Rayner -- <...> > I have a database of 1000 tables, 300 of theirs are of major growing > with 10000 rows daily, the estimate growing for this database is of > 2,6 TB every year. In and of-itself sheer number of rows only hits you when you need to be reading most of them; in that case good hardware (lots of spindles!) would be needed for any database. > There are accessing 5000 clients to this database of which will be > accessed 500 concurrent clients at the same time. That could be too many to handle natively; investigate pgPool and similar tools. > There are the questions: > 1. Is capable PostgreSQL to support this workload? Some examples > better than this. Depends on the native hardware and the types of queries. > 2. It is a recommendation to use a cluster with load balancer and > replication for this situation? Which tools are recommended for this > purpose? Depends on what you mean -- there is no multimaster solution in postgreSQL as far as I know, but if you only need one central servers and R/O slaves there are several possible solutions (Slony as an add-on as well as the new capabilities in the engine itself. > 3. Which are the hardware recommendations to deploy on servers? CPU, > RAM memory capacity, Hard disk capacity and type of RAID system > recommended to use among others like Operating System and network > connection speed. RAID-5 is generally a bad choice for databases. The specific answers to these questions need more info on workload, etc. I migrated a fairly large Informix system to postgres a few years ago and the main issues had to do with postGIS vs. Informix Spatial Blade; the core tables converted cleanly; the users and permissions were also easy. We needed to use pgPool to get the same number of connections. This was actually a migration -- from Sun Solaris to Linux so comparing the two directly wasn't easy. We moved "chunks" on the application and tested a lot; spatial data first and the bookkeeping and accounting functions and finally the warehouse and large-but-infrequent jobs. HTH, Greg Williamson
Re: I need your help to get opinions about this situation
From
Rayner Julio Rodríguez Pimentel
Date:
Thanks to David and Greg, your responses helped me so much to organize my ideas. I'm agree with your opinions. The problem is that I have to build a solid data architecture for a comercial system that will have many reading queries and in some peak times many clients executing their. Thanks again. 2011/3/3, Greg Williamson <gwilliamson39@yahoo.com>: > Rayner -- > > > > <...> >> I have a database of 1000 tables, 300 of theirs are of major growing >> with 10000 rows daily, the estimate growing for this database is of >> 2,6 TB every year. > > In and of-itself sheer number of rows only hits you when you need to be > reading most of them; in that case good hardware (lots of spindles!) would > be needed for any database. > >> There are accessing 5000 clients to this database of which will be >> accessed 500 concurrent clients at the same time. > > That could be too many to handle natively; investigate pgPool and similar > tools. > >> There are the questions: >> 1. Is capable PostgreSQL to support this workload? Some examples >> better than this. > > Depends on the native hardware and the types of queries. > >> 2. It is a recommendation to use a cluster with load balancer and >> replication for this situation? Which tools are recommended for this >> purpose? > > Depends on what you mean -- there is no multimaster solution in postgreSQL > as far as I know, but if you only need one central servers and R/O slaves > there > are several possible solutions (Slony as an add-on as well as the new > capabilities > in the engine itself. > >> 3. Which are the hardware recommendations to deploy on servers? CPU, >> RAM memory capacity, Hard disk capacity and type of RAID system >> recommended to use among others like Operating System and network >> connection speed. > > RAID-5 is generally a bad choice for databases. The specific answers to > these > questions > need more info on workload, etc. > > I migrated a fairly large Informix system to postgres a few years ago and > the > main issues > had to do with postGIS vs. Informix Spatial Blade; the core tables converted > cleanly; the > users and permissions were also easy. We needed to use pgPool to get the > same > number > of connections. This was actually a migration -- from Sun Solaris to Linux > so > comparing > the two directly wasn't easy. > > We moved "chunks" on the application and tested a lot; spatial data first > and > the bookkeeping > and accounting functions and finally the warehouse and large-but-infrequent > jobs. > > HTH, > > Greg Williamson > > > >