Thread: Re: Performance large tables.

Re: Performance large tables.

From
"Roger Hand"
Date:
Benjamin Arai
wrote on Saturday, December 10, 2005 3:37 PM
> ... On the other hand there is a weekly update (This is the
> problem) that updates all of the modified records for a bunch of
> finacial data such as closes and etc.  For the most part they are
> records of the type name,date,value.  The update currently takes almost
> two days.   The update does deletions, insertion, and updates depending
> on what has happened from the previous week.
>
> For the most part the updates are simple one liners.  I currently commit
> in large batch to increase performance but it still takes a while as
> stated above.  From evaluating the computers performance during an
> update,  the system is thrashing both memory and disk.

I experimented with batch size and found that large batches (thousands or
tens of thousands) slowed things down in our situation, while using a
batch size of around 100 or so sped things up tremendously.
Of course, YMMV ...

-Roger

Re: Performance large tables.

From
Franz.Rasper@izb.de
Date:
Hello,

may I ask you some questions.

What is the performance difference between U320 15kRPM and U320 10kRPM ?
Does your RAID crontoller has some memory (e.g. 128 MB or 256 MB )
and something like memory backup write cache (like HP DL 380 server) ?
Do you use Intel or Opteron cpus ?

regards,

-Franz

-----Ursprüngliche Nachricht-----
Von: Vivek Khera [mailto:vivek@khera.org]
Gesendet: Montag, 12. Dezember 2005 23:15
An: PG-General General
Betreff: Re: [GENERAL] Performance large tables.



On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote:

> For the most part the updates are simple one liners.  I currently
> commit in large batch to increase performance but it still takes a
> while as stated above.  From evaluating the computers performance
> during an update,  the system is thrashing both memory and disk.  I
> am currently using Postgresql 8.0.3.

Then buy faster disks.  My current favorite is to use U320 15kRPM
disks using a dual-chanel RAID controller with 1/2 the disks on one
channel and 1/2 on the other and mirroring them across channels, then
striping down the mirrors (ie, RAID10).

I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a
RAID1.


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: Performance large tables.

From
"Benjamin Arai"
Date:
What kind of performance boost do you get from using raid 10?  I am trying
to do a little cost analysis.

Benjamin Arai
barai@cs.ucr.edu
benjamin@cs.ucr.edu
http://www.benjaminarai.com



> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of
> Franz.Rasper@izb.de
> Sent: Monday, December 12, 2005 11:50 PM
> To: vivek@khera.org
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Performance large tables.
>
> Hello,
>
> may I ask you some questions.
>
> What is the performance difference between U320 15kRPM and
> U320 10kRPM ?
> Does your RAID crontoller has some memory (e.g. 128 MB or 256
> MB ) and something like memory backup write cache (like HP DL
> 380 server) ?
> Do you use Intel or Opteron cpus ?
>
> regards,
>
> -Franz
>
> -----Ursprüngliche Nachricht-----
> Von: Vivek Khera [mailto:vivek@khera.org]
> Gesendet: Montag, 12. Dezember 2005 23:15
> An: PG-General General
> Betreff: Re: [GENERAL] Performance large tables.
>
>
>
> On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote:
>
> > For the most part the updates are simple one liners.  I currently
> > commit in large batch to increase performance but it still takes a
> > while as stated above.  From evaluating the computers performance
> > during an update,  the system is thrashing both memory and
> disk.  I
> > am currently using Postgresql 8.0.3.
>
> Then buy faster disks.  My current favorite is to use U320 15kRPM
> disks using a dual-chanel RAID controller with 1/2 the disks on one
> channel and 1/2 on the other and mirroring them across
> channels, then
> striping down the mirrors (ie, RAID10).
>
> I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a
> RAID1.
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: Performance large tables.

From
Vivek Khera
Date:
On Dec 13, 2005, at 2:49 AM, Franz.Rasper@izb.de wrote:

> What is the performance difference between U320 15kRPM and U320
> 10kRPM ?
> Does your RAID crontoller has some memory (e.g. 128 MB or 256 MB )
> and something like memory backup write cache (like HP DL 380 server) ?
> Do you use Intel or Opteron cpus ?

The 15k drives have higher sustained throughput so theoretically they
would be faster for sequential scans of data.  I have no hard numbers
about this, though.  See my thread on choosing between them from last
thursday.

As for RAID controller, I've been using LSI MegaRAID 320-2x
controller lately.  I like it a lot.  I configure 1/2 the disks on
one channel and 1/2 on the other and RAID mirror and RAID 10 them
across the channels.  I *always* get battery backup for the
controllers.  No point not to do so.

As for Intel vs. Opteron: Opteron hands down.  The Intel Xeon EM64T
are adequate for low-end use, but for really pushing the bits back
and forth the Opterons are the top of the heap in performance.


Re: Performance large tables.

From
Vivek Khera
Date:
On Dec 13, 2005, at 3:50 AM, Benjamin Arai wrote:

> What kind of performance boost do you get from using raid 10?  I am
> trying
> to do a little cost analysis.

For small amounts of data you probably wont notice anything.  Once
you get into the 10's of GB you'll notice improvement when you have
lots of scattered reads as they get parallelized well.


Re: Performance large tables.

From
Greg Stark
Date:
Vivek Khera <vivek@khera.org> writes:

> On Dec 13, 2005, at 2:49 AM, Franz.Rasper@izb.de wrote:
>
> > What is the performance difference between U320 15kRPM and U320  10kRPM ?
> > Does your RAID crontoller has some memory (e.g. 128 MB or 256 MB )
> > and something like memory backup write cache (like HP DL 380 server) ?
> > Do you use Intel or Opteron cpus ?
>
> The 15k drives have higher sustained throughput so theoretically they  would be
> faster for sequential scans of data.  I have no hard numbers  about this,
> though.  See my thread on choosing between them from last  thursday.

Actually the 15k drives have only moderately higher throughput. The top of the
line 15k Maxtor has a maximum throughput of 98MB/s while my 3 year old 7200
rpm drive can get over 50MB/s. Newer 7200rpm drives would be better but they
don't seem to include throughput in their specs.

While that's not bad, the difference in seek time and rotational latency is
the main advantage of a faster drive. The seek time of a 7200rpm is about
8-9ms and the rotational latency about 4ms. The seek time of the to of the
line 15kRPM drive is about 3ms and the rotational latency 2ms.

In short while the throughput is less than doubled, the speed for random
access reads is almost tripled.

--
greg