Thread: is postgres a good solution for billion record data

is postgres a good solution for billion record data

From: shahrzad khorrami
Is Postgres a good solution for billion-record data? Think of about 300 KB of data inserted into the db every minute; I'm coding in PHP.
What do you recommend for managing this data?

--
Shahrzad Khorrami

Re: is postgres a good solution for billion record data

From: Scott Marlowe
On Sat, Oct 24, 2009 at 7:32 AM, shahrzad khorrami
<shahrzad.khorrami@gmail.com> wrote:
> Is Postgres a good solution for billion-record data? Think of about 300 KB
> of data inserted into the db every minute; I'm coding in PHP.
> What do you recommend for managing this data?

You'll want a server with LOTS of hard drives spinning under it, and a fast
RAID controller with battery-backed RAM.  Inserting the data is no
problem; 300 KB a minute is nothing.  My stats machine, which handles
about 2.5M rows a day during the week, is inserting at megabytes
per second (it's also the search database, so there's the indexer with
16 threads hitting it).  The stats part of the load is minuscule until
you start retrieving large chunks of data; then it's mostly sequential
reads at 100+ MB a second.
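
Roughly what the insert side can look like in PHP with PDO, as a sketch only
(the samples(ts, payload) table and connection details here are made up):
batch a minute's worth of rows into one multi-row INSERT inside a single
transaction, so you pay one round trip per batch instead of one per row.

<?php
// A minimal sketch, not production code. Table name, columns, and
// credentials below are hypothetical.
$pdo = new PDO('pgsql:host=localhost;dbname=stats', 'statsuser', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

function insertBatch(PDO $pdo, array $rows)
{
    // One placeholder pair per row; a single multi-row INSERT is far
    // cheaper than one statement per row.
    $placeholders = implode(', ', array_fill(0, count($rows), '(?, ?)'));
    $stmt = $pdo->prepare(
        "INSERT INTO samples (ts, payload) VALUES $placeholders"
    );

    $params = array();
    foreach ($rows as $row) {
        list($ts, $payload) = $row;
        $params[] = $ts;
        $params[] = $payload;
    }
    $stmt->execute($params);
}

// Example: flush one minute's worth of data in a single transaction.
$pdo->beginTransaction();
insertBatch($pdo, array(
    array('2009-10-24 07:32:00', 'first chunk of data'),
    array('2009-10-24 07:32:30', 'second chunk of data'),
));
$pdo->commit();

For genuinely bulk loads, COPY is faster still: pg_copy_from() in PHP's
pgsql extension, or \copy from psql.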

The more drives and the better the RAID controller you throw at the
problem, the better performance you'll get.  For the price of one
Oracle license for one core, you can build a damned fine pgsql server
or pair of servers.

Re: is postgres a good solution for billion record data

From: Scott Marlowe
On Sat, Oct 24, 2009 at 2:43 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> [snip]

Quick reference: you can get one of these:

http://www.aberdeeninc.com/abcatg/Stirling-X888.htm

with dual 2.26GHz Nehalem CPUs, 48 GB of RAM, and 48 73GB 15kRPM Seagate
Barracudas for around $20,000.  That's about the cost of a single
Oracle license for one CPU, and it's way overkill for what you're
talking about doing.  A machine with 8 or 16 disks could easily handle
the load you're talking about.
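
For scale, a rough back-of-the-envelope assuming the 300 KB a minute holds
steady: that's about 18 MB an hour, roughly 430 MB a day, and on the order
of 150 GB a year before indexes, so even the 8-disk machine has years of
capacity headroom.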