Thread: I need your help to get opinions about this situation

I need your help to get opinions about this situation

From
Rayner Julio Rodríguez Pimentel
Date:
Hello everybody,
I have a situation I would like your opinions on, to confirm some points
that would give me confidence in using the excellent PostgreSQL database
system.
I have a database of 1000 tables; 300 of them grow quickly, at about
10000 rows per day, and the estimated growth for the whole database is
2.6 TB per year.
5000 clients access this database, of which up to 500 will be connected
concurrently.
My questions are:
1.    Is PostgreSQL capable of supporting this workload? Are there
examples of deployments larger than this?
2.    Is a cluster with a load balancer and replication recommended for
this situation? Which tools are recommended for this purpose?
3.    What are the hardware recommendations for the servers: CPU, RAM
capacity, disk capacity and recommended RAID level, as well as operating
system and network connection speed?
Greetings and thanks a lot.

Re: I need your help to get opinions about this situation

From
"David Johnston"
Date:
I'll leave the "can/cannot" responses to those more familiar with
high-load situations, but I am curious: what reasons other than cost are
leading you to discontinue using your current database engine? With that
many entities, any migration is likely to be quite challenging even if
you restricted initial development to standard SQL.

My estimate is that PostgreSQL could meet your needs - though, as you
say, tools for connection pooling and the like are going to be critical.
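For example, a pooler such as PgBouncer sits between the application and
the server and maps thousands of client connections onto a much smaller
number of real backends. A minimal, untested pgbouncer.ini sketch - the
host name, database name and pool sizes below are placeholder assumptions,
not recommendations:

    [databases]
    ; the application database; host and name are placeholders
    mydb = host=db1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = *
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; reuse server connections between transactions
    pool_mode = transaction
    ; the 5000 application clients connect to the pooler...
    max_client_conn = 5000
    ; ...but only a small number of real backend connections stay open
    default_pool_size = 50

Note that transaction pooling has caveats (session state such as prepared
statements does not survive it), so test it against your application first.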

As for recommendations - buy as much hardware as you can afford.

David J.




Re: I need your help to get opinions about this situation

From
Greg Williamson
Date:
Rayner --



<...>
> I have a database of 1000 tables; 300 of them grow quickly, at about
> 10000 rows per day, and the estimated growth for the whole database is
> 2.6 TB per year.

In and of itself, the sheer number of rows only hurts you when you need
to read most of them; in that case good hardware (lots of spindles!)
would be needed for any database.

> 5000 clients access this database, of which up to 500 will be
> connected concurrently.

That could be too many to handle natively; investigate pgPool and similar tools.
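pgpool-II, for instance, can both pool connections and spread read-only
queries across standby servers. A rough, untested pgpool.conf sketch --
the host names and sizes are placeholders you would have to tune for
your own workload:

    listen_addresses = '*'
    port = 9999

    # primary server
    backend_hostname0 = 'db1'
    backend_port0 = 5432
    backend_weight0 = 1

    # read-only standby that takes a share of the SELECT traffic
    backend_hostname1 = 'db2'
    backend_port1 = 5432
    backend_weight1 = 1

    # pooling: each child process keeps cached backend connections
    num_init_children = 200
    max_pool = 4
    connection_cache = on

    # send read-only statements to the standby as well
    load_balance_mode = on
    master_slave_mode = on
    master_slave_sub_mode = 'stream'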

> My questions are:
> 1.    Is PostgreSQL capable of supporting this workload? Are there
> examples of deployments larger than this?

Depends on the actual hardware and the types of queries.

> 2.    Is a cluster with a load balancer and replication recommended
> for this situation? Which tools are recommended for this purpose?

Depends on what you mean -- there is no multi-master solution in
PostgreSQL as far as I know, but if you only need one central server and
read-only slaves there are several possible solutions (Slony as an
add-on, as well as the new streaming replication capabilities in the
engine itself).
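For the built-in route (9.0's streaming replication with hot standby),
the setup is roughly as follows -- the host name and replication role are
placeholders, and the standby also needs a base backup of the primary
before it will start:

    # on the primary, postgresql.conf
    wal_level = hot_standby
    max_wal_senders = 3
    wal_keep_segments = 128

    # on the primary, pg_hba.conf -- allow the standby to stream WAL
    host  replication  replicator  192.168.0.10/32  md5

    # on the standby, postgresql.conf
    hot_standby = on

    # on the standby, recovery.conf
    standby_mode = 'on'
    primary_conninfo = 'host=db1 port=5432 user=replicator'

The standby can then serve read-only queries while it replays WAL from
the primary.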

> 3.    What are the hardware recommendations for the servers: CPU, RAM
> capacity, disk capacity and recommended RAID level, as well as
> operating system and network connection speed?

RAID-5 is generally a bad choice for databases. The specific answers to
these questions need more information about the workload, etc.

I migrated a fairly large Informix system to Postgres a few years ago,
and the main issues had to do with PostGIS vs. the Informix Spatial
Blade; the core tables converted cleanly, and the users and permissions
were also easy. We needed to use pgPool to support the same number of
connections. It was also a platform migration -- from Sun Solaris to
Linux -- so comparing the two systems directly wasn't easy.

We moved "chunks" of the application and tested a lot: spatial data
first, then the bookkeeping and accounting functions, and finally the
warehouse and large-but-infrequent jobs.

HTH,

Greg Williamson




Re: I need your help to get opinions about this situation

From
Rayner Julio Rodríguez Pimentel
Date:
Thanks to David and Greg, your responses helped me a lot to organize my
ideas. I agree with your opinions. The challenge is that I have to build
a solid data architecture for a commercial system that will be
read-heavy, with many clients running queries at once during peak times.
Thanks again.
