Re: Massive performance issues - Mailing list pgsql-performance
From | Ron |
---|---|
Subject | Re: Massive performance issues |
Date | |
Msg-id | 6.2.3.4.0.20050901184427.05ecb4a8@pop.earthlink.net |
In response to | Re: Massive performance issues (Matthew Sackman <matthew@lshift.net>) |
List | pgsql-performance |
At 06:22 PM 9/1/2005, Matthew Sackman wrote:

>On Thu, Sep 01, 2005 at 06:05:43PM -0400, Ron wrote:
> >
> > Since I assume you are not going to run anything with the string
> > "unstable" in its name in production (?!), why not try a decent,
> > production-ready distro like SUSE 9.x and see how pg 8.0.3 runs on an
> > OS more representative of what you are likely (or at least what is
> > safe...) to run in production?
>
>Well, you see, as ever, it's a bit complicated. The company I'm doing
>the development for has been subcontracted to do it, and the contractor
>was contracted by the actual "client". So there are two companies involved
>in addition to the "client". Sadly, the "client" has actually dictated
>things like "it will be deployed on FreeBSD and thou shall not argue".

At least get them to promise they will use a release the BSD folks mark "stable"!

>At this point in time, I actually have very little information about the
>specification of the boxen that'll be running this application. This is
>something I'm hoping to solve very soon. The worst part of it is that
>I'm not going to have direct (ssh) access to the box, and all configuration
>changes will most likely have to be relayed through techies at the
>"client", so fine-tuning this is going to be a veritable nightmare.

IME, what you have actually just said is: "It will not be possible to safely fine-tune the DB unless or until I have direct access, and/or someone who does have direct access is correctly trained." Ick.
> > >> > I need to get to the stage where I can run queries such as:
> > >> > select street, locality_1, locality_2, city from address
> > >> > where (city = 'Nottingham' or locality_2 = 'Nottingham'
> > >> >        or locality_1 = 'Nottingham')
> > >> > and upper(substring(street from 1 for 1)) = 'A'
> > >> > group by street, locality_1, locality_2, city
> > >> > order by street
> > >> > limit 20 offset 0
> > >>
> > >> This might be a lot quicker than pulling all the records like in
> > >> your example queries...
> > >
> > >Yes, that certainly does seem to be the case - around 4 seconds. But I
> > >need it to be 10 times faster (or thereabouts) otherwise I have big
> > >problems!
> >
> > *beats drum* Get it in RAM, Get it in RAM, ...
>
>Ok, but I currently have 2 million rows. When this launches in a couple
>of weeks, it'll launch with 5 million+ and then gain
>a million a year.

At my previously mentioned optimum of 85B per row, 2M rows is 170MB and 5M rows is 425MB. Assuming a gain of 1M rows per year, that's +85MB per year for this table.

2GB DIMMs are currently standard, and 4GB DIMMs are just in the process of being introduced. Mainboards with anything from 4 to 16 DIMM slots are widely available. IOW, given the description you've provided, this DB should _always_ fit in RAM.

Size the production system so that the entire DB fits into RAM during ordinary operation, with an extra 1GB of RAM initially tossed on as a safety measure, and the client will be upgrading the HW because it's obsolete before they run out of room in RAM.

>I think the upshot of this all is 4GB RAM as a minimum and judicious use
>of normalization so as to avoid more expensive string comparisons and
>reduce table size is my immediate plan (along with proper configuration
>of pg).

My suggestion is only slightly different: reduce table size(s) and up the RAM to the point where the whole DB fits comfortably in RAM. You've got the rare opportunity to build a practical memory-resident database.
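The sizing arithmetic above can be sanity-checked in a few lines (a sketch; the 85-bytes-per-row figure is the estimate quoted earlier in the thread, and MB here means decimal megabytes, which is how the 170MB/425MB numbers come out):

```python
# Back-of-envelope sizing for the address table, assuming ~85 B per row
# (the "optimum" row width estimated earlier in this thread).
# MB here is 10**6 bytes, matching the decimal figures quoted above.
BYTES_PER_ROW = 85
MB = 1_000_000

def table_size_mb(rows: int) -> float:
    """Approximate table size in decimal MB for a given row count."""
    return rows * BYTES_PER_ROW / MB

print(table_size_mb(2_000_000))  # 170.0 -- current 2M rows
print(table_size_mb(5_000_000))  # 425.0 -- 5M rows at launch
print(table_size_mb(1_000_000))  # 85.0  -- growth per year at 1M rows/yr
```

Even at several years of growth this stays well under a single modern DIMM, which is the basis of the "it should always fit in RAM" claim.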
It should run like a banshee when you're done. I'd love to see the benches on the final product.

Ron Peacetree