Re: 10 TB database - Mailing list pgsql-general

From Greg Smith
Subject Re: 10 TB database
Msg-id alpine.GSO.2.01.0906161519400.120@westnet.com
In response to Re: 10 TB database  (Michelle Konzack <linux4michelle@tamay-dogan.net>)
List pgsql-general
On Tue, 16 Jun 2009, Michelle Konzack wrote:

> On 2009-06-16 12:13:20, Greg Smith wrote:
>> you'll be hard pressed to keep up with 250GB/day unless you write a
>> custom data loader that keeps multiple cores
>
> AFAIK he was talking about 250 GByte/month, which is around 8 GByte a
> day or 300 MByte per hour

Right, that was just a typo in my response; the comments reflected what he
meant.  Note that your averages here presume you can spread that out over
a full 24 hour period, which you often can't: this type of data tends to
come in a big clump after market close and needs to be loaded ASAP for it
to be useful.
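
To put rough numbers on how much the load window matters, here is a small
Python sketch of the arithmetic.  The 30-day month, decimal units
(1 GB = 1000 MB), and the 4 hour and 1 hour windows are assumptions picked
for illustration; none of them come from the original poster.

```python
def required_rate_mb_per_s(gb_per_month, days_per_month=30, window_hours=24.0):
    """Average MB/s needed to load one day's data inside the given window."""
    gb_per_day = gb_per_month / days_per_month
    mb_per_day = gb_per_day * 1000  # assumption: decimal units, 1 GB = 1000 MB
    return mb_per_day / (window_hours * 3600)


# Compare spreading the load over a full day with two shorter, assumed
# post-market-close windows.
for hours in (24, 4, 1):
    rate = required_rate_mb_per_s(250, window_hours=hours)
    print(f"{hours:>2} h window: {rate:.2f} MB/s sustained")
```

The required rate scales inversely with the window: squeezing the same
daily volume into one hour needs 24 times the sustained rate of the
all-day average.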

It's harder than most people would guess to sustain that sort of rate
against real-world data (which even fails to import some days) in
PostgreSQL without running into a bottleneck in COPY, WAL traffic, or
database disk I/O (particularly if there's any random access stuff going
on concurrently with the load).  Just because your RAID array can write at
hundreds of MB/s does not mean you'll be able to sustain anywhere close to
that during your loading.
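
For a sense of what a loader that keeps multiple cores busy might look
like, here is a minimal sketch, not a tuned implementation.  It assumes
the psycopg2 driver, CSV input files, and a hypothetical DSN and staging
table (`dbname=ticks`, `ticks_staging`); all of those are placeholders.
Running several COPY streams in parallel only helps with the per-process
COPY bottleneck; WAL traffic and disk I/O can still be the limit.

```python
from concurrent.futures import ProcessPoolExecutor


def partition(paths, n_workers):
    """Round-robin the input files into n_workers groups."""
    return [paths[i::n_workers] for i in range(n_workers)]


def load_files(paths, dsn="dbname=ticks", table="ticks_staging"):
    """COPY each CSV file over one connection; return the files that failed.

    The DSN and table name are hypothetical placeholders.
    """
    import psycopg2  # imported here so the planning code runs without the driver

    failed = []
    conn = psycopg2.connect(dsn)
    try:
        for path in paths:
            try:
                with open(path) as f, conn.cursor() as cur:
                    # table is a trusted constant here, not user input
                    cur.copy_expert(f"COPY {table} FROM STDIN WITH CSV", f)
                conn.commit()  # commit per file so one bad file does not undo the rest
            except psycopg2.Error:
                conn.rollback()  # real-world data does fail to import some days
                failed.append(path)
    finally:
        conn.close()
    return failed


def parallel_load(paths, n_workers=4):
    """Run one COPY stream per worker process; return all failed files."""
    groups = [g for g in partition(paths, n_workers) if g]
    if not groups:
        return []
    with ProcessPoolExecutor(max_workers=len(groups)) as pool:
        return [p for failed in pool.map(load_files, groups) for p in failed]
```

Something like `parallel_load(list_of_csv_paths, n_workers=4)` would be
the entry point; the right worker count depends on the hardware and has to
be measured rather than guessed.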

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
