Thread: Hi Community

Hi Community

From
Naresh Soni
Date:
Good Evening Community

This is my first question on the list. I wanted to ask: can Postgres handle many millions of records? For example, there will be 1 million records per table per day, so 365 million per year.

If yes, then please elaborate.

Thank you for your help!



--
Regards,
Naresh Soni




Re: Hi Community

From
Kevin Grittner
Date:
Naresh Soni <jmnaresh@gmail.com> wrote:

> This is my first question on the list, I wanted to ask if
> postgres can handle multi millions records? for example there
> will be 1 million records per table per day, so 365 millions per
> year.

Yes, I have had hundreds of millions of rows in a table without
performance problems. If you want to see such a table in action,
go to the following web site, bring up a court case, and click the
"Court Record Events" button. Last I knew the table containing
court record events had about 450 million rows, with no
partitioning.  The total database was 3.5 TB.

http://wcca.wicourts.gov/

> Is yes, then please elaborate.

You will want indexes on columns used in the searches. Depending
on details you have not provided it might be beneficial to
partition the table. Do not consider partitioning to be some
special magic which always makes things faster, though -- it can
easily make performance much worse if it is not a good fit.
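
For example (a rough sketch only -- the table and column names below
are invented, since you haven't shown your schema):

  -- index the columns your reports will filter or join on
  CREATE INDEX invoices_invoice_date_idx ON invoices (invoice_date);
  CREATE INDEX invoices_customer_id_idx  ON invoices (customer_id);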

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Hi Community

From
Naresh Soni
Date:

Hi Kevin,

Thanks for your response.

So you mean Postgres can handle such a huge number of records by default, without any fine-tuning required, except that we will need indexes for searching?

On 02-Feb-2015 8:03 PM, "Kevin Grittner" <kgrittn@ymail.com> wrote:
Naresh Soni <jmnaresh@gmail.com> wrote:

> This is my first question on the list, I wanted to ask if
> postgres can handle multi millions records? for example there
> will be 1 million records per table per day, so 365 millions per
> year.

Yes, I have had hundreds of millions of rows in a table without
performance problems. If you want to see such a table in action,
go to the following web site, bring up a court case, and click the
"Court Record Events" button. Last I knew the table containing
court record events had about 450 million rows, with no
partitioning.  The total database was 3.5 TB.

http://wcca.wicourts.gov/

> Is yes, then please elaborate.

You will want indexes on columns used in the searches. Depending
on details you have not provided it might be beneficial to
partition the table. Do not consider partitioning to be some
special magic which always makes things faster, though -- it can
easily make performance much worse if it is not a good fit.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Hi Community

From
Scott Ribe
Date:
On Feb 2, 2015, at 8:14 AM, Naresh Soni <jmnaresh@gmail.com> wrote:
>
> ...without any fine tuning required on postgres...

No. What hardware you need and how you tune will depend greatly on your particular access patterns for this data.
PostgreSQL can handle a db of that size, but you'll have to set it up correctly for good performance.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: Hi Community

From
Kevin Grittner
Date:
Naresh Soni <jmnaresh@gmail.com> wrote:

> So you mean postgres can handle such huge records without any
> fine tuning required on postgres

For any non-trivial database you will certainly need to make
adjustments to the configuration files, and probably to the OS,
too.  You may want to pay for some advice from one of the companies
with people experienced in tuning PostgreSQL, but you can get a
good start on things with the advice on this Wiki page:

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
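
To give a feel for what that page covers, here are a few of the
settings most installations end up adjusting.  The values below are
placeholders for a hypothetical machine with 16GB of RAM, not
recommendations for your workload:

  # postgresql.conf (illustrative values only)
  shared_buffers = 4GB          # often around 25% of RAM
  effective_cache_size = 12GB   # planner's estimate of memory available for caching
  work_mem = 32MB               # per sort/hash operation, per backend
  maintenance_work_mem = 1GB    # speeds up index builds and vacuum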

You should also familiarize yourself with the PostgreSQL
documentation on the topic:

http://www.postgresql.org/docs/current/interactive/runtime-config.html

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Hi Community

From
Craig James
Date:

On Mon, Feb 2, 2015 at 3:18 AM, Naresh Soni <jmnaresh@gmail.com> wrote:
Good Evening Community

This is my first question on the list. I wanted to ask: can Postgres handle many millions of records? For example, there will be 1 million records per table per day, so 365 million per year.

If yes, then please elaborate.

You should read this book. It's more detailed than you need at this point, but it will give you an idea of what is possible:


Craig
 

Thank you for your help ! 



--
Regards,
Naresh Soni







--
---------------------------------
Craig A. James
Chief Technology Officer
eMolecules, Inc.
---------------------------------

Re: Hi Community

From
Steve Crawford
Date:
On 02/02/2015 03:18 AM, Naresh Soni wrote:
Good Evening Community

This is my first question on the list. I wanted to ask: can Postgres handle many millions of records? For example, there will be 1 million records per table per day, so 365 million per year.

If yes, then please elaborate.


Yes. Maybe. But you will need to elaborate. (A plain text file can store millions of records, too. The use pattern is what matters.)

For reference, I have a modest database of about 350 million records running on modest hardware (a several-year-old server with 4 cores, 24GB RAM, and a 4-drive RAID10 with battery backup) and get mostly acceptable performance, with query peaks well over 1,000 TPS at about 2/3 writes.

The people here can give you lots of good advice, with purchasing the PostgreSQL Performance book being one good piece of it. PostgreSQL is feature-rich, but you will need to design and choose appropriately. To get started, a few questions jump out:

  • You say 1-million/day/table. How many tables (i.e. what is the total record count)?

  • How long do you want to keep the data (total anticipated database size)?

  • Can the data be easily partitioned? For example, can you set up child tables by month for query and archive efficiency? (There is a small sketch of this after the list.)

  • What sort of query mix do you anticipate? What portion of inserts, updates, selects,...? Lots of data-warehouse type queries?

  • Is the schema fairly simple or are there lots of complex foreign-key constraints?

  • What sort of performance do you need - fraction-of-a-second transactional response or overnight analytical runs?

  • What hardware can you throw at it? At one end I've run PostgreSQL on a Raspberry Pi. At the other end, I'm about to migrate a database to a 20-core/128GB machine with Intel SSD storage that can purportedly match the TPS of 2,000 15k drives. (This is certainly not the upper limit of hardware for running PostgreSQL). Each is appropriate for its use-case.
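
To illustrate the partitioning question above (purely a sketch - the table and column names are invented), a monthly child table in the inheritance style looks roughly like this:

  -- parent table; the children each hold one month of rows
  CREATE TABLE invoices (
      invoice_id   bigserial,
      invoice_date date NOT NULL,
      amount       numeric(12,2)
  );

  -- the CHECK constraint lets the planner skip months a query can't match
  CREATE TABLE invoices_2015_02 (
      CHECK (invoice_date >= DATE '2015-02-01'
         AND invoice_date <  DATE '2015-03-01')
  ) INHERITS (invoices);

Old months can then be archived or dropped as a unit rather than deleted row by row.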

In any case, don't let the number of options and features overwhelm you - you don't need to optimize prematurely. Describe your use-case here for some starting suggestions, get familiar with it, and adjust your design and settings as necessary.

Welcome to the list.

Cheers,
Steve

Re: Hi Community

From
Naresh Soni
Date:
Hi Everyone, 

Thank you all for your replies! I really appreciate it and am pleased to be a part of such a great community of enthusiastic people.

@Steve: I replied inline to your questions.

On 3 February 2015 at 00:57, Steve Crawford <scrawford@pinpointresearch.com> wrote:
On 02/02/2015 03:18 AM, Naresh Soni wrote:
Good Evening Community

This is my first question on the list. I wanted to ask: can Postgres handle many millions of records? For example, there will be 1 million records per table per day, so 365 million per year.

If yes, then please elaborate.


Yes. Maybe. But you will need to elaborate. (A plain text file can store millions of records, too. The use pattern is what matters.)

For reference, I have a modest database of about 350 million records running on modest hardware (a several-year-old server with 4 cores, 24GB RAM, and a 4-drive RAID10 with battery backup) and get mostly acceptable performance, with query peaks well over 1,000 TPS at about 2/3 writes.

The people here can give you lots of good advice, with purchasing the PostgreSQL Performance book being one good piece of it. PostgreSQL is feature-rich, but you will need to design and choose appropriately. To get started, a few questions jump out:

  • You say 1-million/day/table. How many tables (i.e. what is the total record count)?
There is only 1 table in the database which will have such huge data, i.e. 1 million records per day. No doubt there are other tables in the database, but their record counts are limited.

  • How long do you want to keep the data (total anticipated database size)?

I plan to use a new database every year.
  • Can the data be easily partitioned? For example, can you set up child tables by month for query and archive efficiency?

Don't  think so. 
  • What sort of query mix do you anticipate? What portion of inserts, updates, selects,...? Lots of data-warehouse type queries?

Yes, there will be complex queries, as this table will be used for reporting purposes. In short, this table will contain financial invoices.
  • Is the schema fairly simple or are there lots of complex foreign-key constraints?
This table has 1 primary and 15 foreign key constraints.

  • What sort of performance do you need - fraction-of-a-second transactional response or overnight analytical runs?
A good response time is expected when data is being fetched.

  • What hardware can you throw at it? At one end I've run PostgreSQL on a Raspberry Pi. At the other end, I'm about to migrate a database to a 20-core/128GB machine with Intel SSD storage that can purportedly match the TPS of 2,000 15k drives. (This is certainly not the upper limit of hardware for running PostgreSQL). Each is appropriate for its use-case.


Not planned yet.
 
In any case, don't let the number of options and features overwhelm you - you don't need to optimize prematurely. Describe your use-case here for some starting suggestions, get familiar with it, and adjust your design and settings as necessary.

Welcome to the list.

Thank you so much for your help!
 
Cheers,
Steve




--
Regards,
Naresh Soni




Re: Hi Community

From
Scott Marlowe
Date:
On Mon, Feb 2, 2015 at 10:14 PM, Naresh Soni <jmnaresh@gmail.com> wrote:
>
> This is only 1 table in the database which will have such huge data i.e 1
> million data per day .  no doubt there are other tables in the database but
> the records are limited.

>> Can the data be easily partitioned? For example, can you set up child
>> tables by month for query and archive efficiency?
>>
> Don't  think so.
>>
>> What sort of query mix do you anticipate? What portion of inserts,
>> updates, selects,...? Lots of data-warehouse type queries?
>>
> Yes, there will be complex queries as this table will be used for reporting
> purpose. In short this table will contain Financial Invoices.
>>
>> Is the schema fairly simple or are there lots of complex foreign-key
>> constraints?
>
> This table has 1 primary and 15 foreign key constraints.

I would tend to question your db design here. This layout is setting
off alarms for me. It sounds like a "garbage pail" table where you
just chuck a bunch of semi-related material without a proper schema
for it. That's a recipe for disaster.