Re: extremly low memory usage - Mailing list pgsql-performance

From Ron
Subject Re: extremly low memory usage
Msg-id 6.2.3.4.0.20050821185759.01f7c720@pop.earthlink.net
In response to extremly low memory usage  (Jeremiah Jahn <jeremiah@cs.earlham.edu>)
List pgsql-performance
I'm resending this as it appears not to have made it to the list.

At 10:54 AM 8/21/2005, Jeremiah Jahn wrote:
>On Sat, 2005-08-20 at 21:32 -0500, John A Meinel wrote:
> > Ron wrote:
> >
> > Well, since you can get a read of the RAID at 150MB/s, that means that
> > it is actual I/O speed. It may not be cached in RAM. Perhaps you could
> > try the same test, only using say 1G, which should be cached.
>
>[root@io pgsql]# time dd if=/dev/zero of=testfile bs=1024 count=1000000
>1000000+0 records in
>1000000+0 records out
>
>real    0m8.885s
>user    0m0.299s
>sys     0m6.998s

This is abysmally slow.


>[root@io pgsql]# time dd of=/dev/null if=testfile bs=1024 count=1000000
>1000000+0 records in
>1000000+0 records out
>
>real    0m1.654s
>user    0m0.232s
>sys     0m1.415s

This transfer rate is the only one out of the 4 you have posted that
is in the vicinity of where it should be.
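
For reference, the two dd runs above can be turned into transfer rates with a little arithmetic. This is only a rough sketch: the function name is made up for illustration, and it assumes decimal megabytes (1 MB = 10**6 bytes) and that the "real" time covers the whole transfer.

```python
# Convert a dd run (block size x record count / elapsed time) into MB/s.
# Assumption: decimal megabytes, 1 MB = 10**6 bytes.

def dd_throughput_mb_s(block_size_bytes: int, count: int, real_seconds: float) -> float:
    """Return the average transfer rate in MB/s for one dd run."""
    total_bytes = block_size_bytes * count
    return total_bytes / real_seconds / 1e6

# Values copied from the two dd runs quoted above (bs=1024 count=1000000).
write_rate = dd_throughput_mb_s(1024, 1_000_000, 8.885)
read_rate = dd_throughput_mb_s(1024, 1_000_000, 1.654)

print(f"write: {write_rate:.0f} MB/s")  # about 115 MB/s
print(f"read:  {read_rate:.0f} MB/s")   # about 619 MB/s
```

Keep in mind that a 1GB read like this may be served partly or wholly from RAM rather than from the disks, as John suggested, so the read figure is not necessarily a measure of the array itself.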


>The raid array I have is currently set up to use a single channel. But I
>have dual controllers in the array. And dual external slots on the card.
>The machine is brand new and has pci-e backplane.
>
So you have 2 controllers each with 2 external slots?  But you are
currently only using 1 controller and only one external slot on that
controller?


> > > Assuming these are U320 15Krpm 147GB HDs, a RAID 10 array of 14 of them
> > > doing raw sequential IO like this should be capable of
> > >  ~7*75MB/s = 525MB/s using Seagate Cheetah 15K.4's
>BTW I'm using Seagate Cheetah 15K.4's

OK, now we have that nailed down.


> > > AFAICT, the Dell PERC4 controllers use various flavors of the LSI Logic
> > > MegaRAID controllers.  What I don't know is which exact one yours is,
> > > nor do I know if it (or any of the MegaRAID controllers) are high
> > > powered enough.
>
>PERC4eDC-PCI Express, 128MB Cache, 2-External Channels

Looks like they are using the LSI Logic MegaRAID SCSI 320-2E
controller.  IIUC, you have 2 of these, each with 2 external channels?

The specs on these appear a bit strange.  They are listed as being a
PCI-E x8 card, which means they should have a max bandwidth of
20Gb/s = 2GB/s, yet they are also listed as only supporting dual
channel U320 = 640MB/s when they could easily support quad channel
U320 = 1.28GB/s.  Why bother building a PCI-E x8 card when a PCI-E x4
card (which is a more standard physical format) would've been
enough?  Or if you are going to build a PCI-E x8 card, why not support
quad channel U320?  This smells like there's a problem with LSI's design.
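
The bandwidth comparison above can be checked with a short sketch. It assumes first-generation PCI Express (2.5 Gb/s raw per lane per direction with 8b/10b encoding, so 10 bits on the wire per data byte) and 320MB/s per U320 channel; the function names are illustrative only.

```python
# Compare host-bus bandwidth with aggregate SCSI channel bandwidth.
# Assumption: PCI Express 1.x, 2.5 Gb/s raw per lane per direction,
# 8b/10b encoded (10 wire bits per data byte).

PCIE_GEN1_RAW_GBIT_PER_LANE = 2.5
WIRE_BITS_PER_DATA_BYTE = 10
U320_MB_S_PER_CHANNEL = 320

def pcie_gen1_mb_s(lanes: int) -> float:
    """Usable one-direction bandwidth in MB/s for a PCIe 1.x link."""
    raw_bits_per_s = lanes * PCIE_GEN1_RAW_GBIT_PER_LANE * 1e9
    return raw_bits_per_s / WIRE_BITS_PER_DATA_BYTE / 1e6

def u320_mb_s(channels: int) -> int:
    """Aggregate peak bandwidth in MB/s for a number of U320 channels."""
    return channels * U320_MB_S_PER_CHANNEL

for lanes in (4, 8):
    bus = pcie_gen1_mb_s(lanes)
    for channels in (2, 4):
        scsi = u320_mb_s(channels)
        verdict = "fits" if scsi <= bus else "exceeds the bus"
        print(f"x{lanes} ({bus:.0f} MB/s) vs {channels}-channel U320 ({scsi} MB/s): {verdict}")
```

Under these assumptions an x4 link (1000MB/s) already covers dual channel U320 (640MB/s), while only an x8 link (2000MB/s) has room for quad channel U320 (1280MB/s).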

The 128MB buffer also looks suspiciously small, and I do not see any
upgrade path for it on LSI Logic's site.  "Serious" RAID controllers
from companies like Xyratex, Engenio, and Dot Hill can have up to
1-2GB of buffer, and there are sound technical reasons for it.  See if
there's a buffer upgrade available or if you can get controllers that
have larger buffer capabilities.

Regardless of the above, each of these controllers should still be
good for about 80-85% of 640MB/s, or ~510-540 MB/s apiece when doing
raw sequential IO if you plug 3-4 fast enough HD's into each SCSI
channel.  Cheetah 15K.4's certainly are fast enough.  Optimal setup
is probably to split each RAID 1 pair so that one HD is on each of
the SCSI channels, and then RAID 0 those pairs.  That will also
protect you from losing the entire disk subsystem if one of the SCSI
channels dies.
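
A minimal sketch of that layout and the throughput estimate, assuming 14 drives arranged as 7 mirror pairs on one dual-channel controller. The channel names "A" and "B", the drive numbering, and the helper function are hypothetical, chosen only to illustrate the idea.

```python
# 14 drives as 7 RAID 1 pairs, each pair split across the two SCSI
# channels, with RAID 0 striped over the pairs.
# Channel names and drive numbering are hypothetical.

CHANNELS = ("A", "B")
NUM_PAIRS = 7

# Each mirror pair gets one drive on channel A and one on channel B.
pairs = [
    {"pair": p, "drives": [(ch, p) for ch in CHANNELS]}
    for p in range(NUM_PAIRS)
]

def survives_channel_loss(pairs, dead_channel):
    """True if every mirror pair keeps at least one drive on a live channel."""
    return all(
        any(ch != dead_channel for ch, _slot in pair["drives"])
        for pair in pairs
    )

# Rough sequential IO ceiling per controller: 80-85% of dual channel U320.
U320_DUAL_MB_S = 2 * 320
low, high = 0.80 * U320_DUAL_MB_S, 0.85 * U320_DUAL_MB_S

print(all(survives_channel_loss(pairs, ch) for ch in CHANNELS))
print(f"expected per controller: {low:.0f}-{high:.0f} MB/s")
```

If both drives of any pair sat on the same channel, the check would fail for that channel, which is the failure mode the split is meant to avoid.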

That 128MB of buffer cache may very well be too small to keep the IO
rate up, and/or there may be a more subtle problem with the LSI card,
and/or you may have a configuration problem, but _something(s)_ need
fixing since you are only getting raw sequential IO of ~100-150MB/s
when it should be above 500MB/s.

This will make the most difference for initial reads (first time you
load a table, first time you make a given query, etc) and for any writes.

Your HW provider should be able to help you, even if some of the HW
in question needs to be changed.  You paid for a solution.  As long
as this stuff is performing at so much less than what it is supposed
to, you have not received the solution you paid for.

BTW, on the subject of RAID stripes, IME the sweet spot tends to be in
the 64KB to 256KB range (very large, very read-heavy data mines can
want larger RAID stripes).  Only experimentation will tell you what
results in the best performance for your application.
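
To get a feel for why stripe size matters before benchmarking, here is a rough sketch of how one read maps onto stripe members. It assumes "stripe" here means the per-disk stripe unit (chunk) size, simple round-robin striping, 7 stripe members (the 7 mirror pairs of a 14-drive RAID 10), and a hypothetical 512KB request. It is not a model of what the controller actually does.

```python
# Map one read onto RAID 0 stripe units to see how many stripe members
# the request touches. Assumptions: round-robin striping, 7 members.

def members_touched(offset_bytes: int, length_bytes: int,
                    stripe_unit_bytes: int, num_members: int = 7) -> set:
    """Return the set of stripe-member indexes a request reads from."""
    first_unit = offset_bytes // stripe_unit_bytes
    last_unit = (offset_bytes + length_bytes - 1) // stripe_unit_bytes
    return {unit % num_members for unit in range(first_unit, last_unit + 1)}

KB = 1024
request = 512 * KB  # hypothetical request size, for illustration only
for stripe_kb in (64, 128, 256):
    touched = members_touched(0, request, stripe_kb * KB)
    print(f"{stripe_kb:>3} KB stripe unit: {len(touched)} of 7 members busy")
```

Smaller stripe units spread a single request over more spindles, while larger ones leave more spindles free for other concurrent requests, which is why the best value depends on the workload and has to be measured.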


>I'm not really worried about the writing, it's the reading
>that needs to be faster.

Initial reads are only going to be as fast as your HD subsystem, so
there's a reason for making the HD subsystem faster even if all you
care about is reads.  In addition, I'll repeat my previous advice
that upgrading to 16GB of RAM would be well worth it for you.

Hope this helps,
Ron Peacetree


