Re: Massive performance issues - Mailing list pgsql-performance
From | Ron |
---|---|
Subject | Re: Massive performance issues |
Date | |
Msg-id | 6.2.3.4.0.20050901184427.05ecb4a8@pop.earthlink.net |
In response to | Re: Massive performance issues (Matthew Sackman <matthew@lshift.net>) |
List | pgsql-performance |
At 06:22 PM 9/1/2005, Matthew Sackman wrote:

>On Thu, Sep 01, 2005 at 06:05:43PM -0400, Ron wrote:
> >
> > Since I assume you are not going to run anything with the string
> > "unstable" in its name in production (?!), why not try a decent,
> > production-ready distro like SUSE 9.x and see how pg 8.0.3 runs on an
> > OS more representative of what you are likely (or at least what is
> > safe...) to run in production?
>
>Well, you see, as ever, it's a bit complicated. The company I'm doing
>the development for has been subcontracted to do it, and the contractor
>was contracted by the actual "client". So there are two companies involved
>in addition to the "client". Sadly, the "client" has actually dictated
>things like "it will be deployed on FreeBSD and thou shall not argue".

At least get them to promise they will use a release the BSD folks mark "stable"!

>At this point in time, I actually have very little information about the
>specification of the boxen that'll be running this application. This is
>something I'm hoping to solve very soon. The worst part of it is that
>I'm not going to have direct (ssh) access to the box, and all configuration
>changes will most likely have to be relayed through techies at the
>"client", so fine-tuning this is going to be a veritable nightmare.

IME, what you have actually just said is: "It will not be possible to safely fine-tune the DB unless or until I have direct access, and/or someone who does have direct access is correctly trained." Ick.
> > >> > I need to get to the stage where I can run queries such as:
> > >> > select street, locality_1, locality_2, city from address
> > >> > where (city = 'Nottingham' or locality_2 = 'Nottingham'
> > >> >        or locality_1 = 'Nottingham')
> > >> > and upper(substring(street from 1 for 1)) = 'A'
> > >> > group by street, locality_1, locality_2, city
> > >> > order by street
> > >> > limit 20 offset 0
> > >>
> > >> This might be a lot quicker than pulling all the records like in
> > >> your example queries...
> > >
> > >Yes, that certainly does seem to be the case - around 4 seconds. But I
> > >need it to be 10 times faster (or thereabouts) otherwise I have big
> > >problems!
> >
> > *beats drum* Get it in RAM, Get it in RAM, ...
>
>Ok, but I currently have 2 million rows. When this launches in a couple
>of weeks, it'll launch with 5 million+ and then gain
>a million a year.

At my previously mentioned optimum of 85B per row, 2M rows is 170MB and 5M rows is 425MB. Assuming a gain of 1M rows per year, that's +85MB per year for this table.

2GB DIMMs are currently standard, and 4GB DIMMs are just in the process of being introduced. Mainboards with anything from 4 to 16 DIMM slots are widely available. IOW, given the description you've provided, this DB should _always_ fit in RAM.

Size the production system so that the entire DB fits into RAM during ordinary operation, with an extra 1GB of RAM initially tossed on as a safety measure, and the client will be upgrading the HW because it's obsolete before they run out of room in RAM.

>I think the upshot of this all is 4GB RAM as a minimum and judicious use
>of normalization so as to avoid more expensive string comparisons and
>reduce table size is my immediate plan (along with proper configuration
>of pg).

My suggestion is only slightly different: reduce table size(s) and up the RAM to the point where the whole DB fits comfortably in RAM. You've got the rare opportunity to build a practical memory-resident database.
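The sizing arithmetic above can be sanity-checked in a few lines (a sketch; the 85-bytes-per-row figure is the estimate quoted earlier in the thread, and MB here means decimal megabytes, which is how the 170MB/425MB numbers come out):

```python
# Back-of-envelope sizing for the address table, assuming ~85 B per row
# (the "optimum" row width estimated earlier in this thread).
# MB here is 10**6 bytes, matching the decimal figures quoted above.
BYTES_PER_ROW = 85
MB = 1_000_000

def table_size_mb(rows: int) -> float:
    """Approximate table size in decimal MB for a given row count."""
    return rows * BYTES_PER_ROW / MB

print(table_size_mb(2_000_000))  # 170.0 -- current 2M rows
print(table_size_mb(5_000_000))  # 425.0 -- 5M rows at launch
print(table_size_mb(1_000_000))  # 85.0  -- growth per year at 1M rows/yr
```

Even at several years of growth this stays well under a single modern DIMM, which is the basis of the "it should always fit in RAM" claim.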
It should run like a banshee when you're done. I'd love to see the benches on the final product.

Ron Peacetree