Thread: Re: Performance large tables.
Benjamin Arai wrote on Saturday, December 10, 2005 3:37 PM > ... On the other hand there is a weekly update (This is the > problem) that updates all of the modified records for a bunch of > finacial data such as closes and etc. For the most part they are > records of the type name,date,value. The update currently takes almost > two days. The update does deletions, insertion, and updates depending > on what has happened from the previous week. > > For the most part the updates are simple one liners. I currently commit > in large batch to increase performance but it still takes a while as > stated above. From evaluating the computers performance during an > update, the system is thrashing both memory and disk. I experimented with batch size and found that large batches (thousands or tens of thousands) slowed things down in our situation, while using a batch size of around 100 or so sped things up tremendously. Of course, YMMV ... -Roger
Hello, may I ask you some questions. What is the performance difference between U320 15kRPM and U320 10kRPM ? Does your RAID crontoller has some memory (e.g. 128 MB or 256 MB ) and something like memory backup write cache (like HP DL 380 server) ? Do you use Intel or Opteron cpus ? regards, -Franz -----Ursprüngliche Nachricht----- Von: Vivek Khera [mailto:vivek@khera.org] Gesendet: Montag, 12. Dezember 2005 23:15 An: PG-General General Betreff: Re: [GENERAL] Performance large tables. On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote: > For the most part the updates are simple one liners. I currently > commit in large batch to increase performance but it still takes a > while as stated above. From evaluating the computers performance > during an update, the system is thrashing both memory and disk. I > am currently using Postgresql 8.0.3. Then buy faster disks. My current favorite is to use U320 15kRPM disks using a dual-chanel RAID controller with 1/2 the disks on one channel and 1/2 on the other and mirroring them across channels, then striping down the mirrors (ie, RAID10). I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a RAID1. ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
What kind of performance boost do you get from using raid 10? I am trying to do a little cost analysis. Benjamin Arai barai@cs.ucr.edu benjamin@cs.ucr.edu http://www.benjaminarai.com > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of > Franz.Rasper@izb.de > Sent: Monday, December 12, 2005 11:50 PM > To: vivek@khera.org > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Performance large tables. > > Hello, > > may I ask you some questions. > > What is the performance difference between U320 15kRPM and > U320 10kRPM ? > Does your RAID crontoller has some memory (e.g. 128 MB or 256 > MB ) and something like memory backup write cache (like HP DL > 380 server) ? > Do you use Intel or Opteron cpus ? > > regards, > > -Franz > > -----Ursprüngliche Nachricht----- > Von: Vivek Khera [mailto:vivek@khera.org] > Gesendet: Montag, 12. Dezember 2005 23:15 > An: PG-General General > Betreff: Re: [GENERAL] Performance large tables. > > > > On Dec 10, 2005, at 6:37 PM, Benjamin Arai wrote: > > > For the most part the updates are simple one liners. I currently > > commit in large batch to increase performance but it still takes a > > while as stated above. From evaluating the computers performance > > during an update, the system is thrashing both memory and > disk. I > > am currently using Postgresql 8.0.3. > > Then buy faster disks. My current favorite is to use U320 15kRPM > disks using a dual-chanel RAID controller with 1/2 the disks on one > channel and 1/2 on the other and mirroring them across > channels, then > striping down the mirrors (ie, RAID10). > > I use no fewer than 6 disks (RAID 10) for data and 2 for pg_log in a > RAID1. > > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org >
On Dec 13, 2005, at 2:49 AM, Franz.Rasper@izb.de wrote: > What is the performance difference between U320 15kRPM and U320 > 10kRPM ? > Does your RAID crontoller has some memory (e.g. 128 MB or 256 MB ) > and something like memory backup write cache (like HP DL 380 server) ? > Do you use Intel or Opteron cpus ? The 15k drives have higher sustained throughput so theoretically they would be faster for sequential scans of data. I have no hard numbers about this, though. See my thread on choosing between them from last thursday. As for RAID controller, I've been using LSI MegaRAID 320-2x controller lately. I like it a lot. I configure 1/2 the disks on one channel and 1/2 on the other and RAID mirror and RAID 10 them across the channels. I *always* get battery backup for the controllers. No point not to do so. As for Intel vs. Opteron: Opteron hands down. The Intel Xeon EM64T are adequate for low-end use, but for really pushing the bits back and forth the Opterons are the top of the heap in performance.
On Dec 13, 2005, at 3:50 AM, Benjamin Arai wrote: > What kind of performance boost do you get from using raid 10? I am > trying > to do a little cost analysis. For small amounts of data you probably wont notice anything. Once you get into the 10's of GB you'll notice improvement when you have lots of scattered reads as they get parallelized well.
Vivek Khera <vivek@khera.org> writes: > On Dec 13, 2005, at 2:49 AM, Franz.Rasper@izb.de wrote: > > > What is the performance difference between U320 15kRPM and U320 10kRPM ? > > Does your RAID crontoller has some memory (e.g. 128 MB or 256 MB ) > > and something like memory backup write cache (like HP DL 380 server) ? > > Do you use Intel or Opteron cpus ? > > The 15k drives have higher sustained throughput so theoretically they would be > faster for sequential scans of data. I have no hard numbers about this, > though. See my thread on choosing between them from last thursday. Actually the 15k drives have only moderately higher throughput. The top of the line 15k Maxtor has a maximum throughput of 98MB/s while my 3 year old 7200 rpm drive can get over 50MB/s. Newer 7200rpm drives would be better but they don't seem to include throughput in their specs. While that's not bad, the difference in seek time and rotational latency is the main advantage of a faster drive. The seek time of a 7200rpm is about 8-9ms and the rotational latency about 4ms. The seek time of the to of the line 15kRPM drive is about 3ms and the rotational latency 2ms. In short while the throughput is less than doubled, the speed for random access reads is almost tripled. -- greg