Re: Moving postgresql.conf tunables into 2003... - Mailing list pgsql-performance
From:           Sean Chittenden
Subject:        Re: Moving postgresql.conf tunables into 2003...
Date:
Msg-id:         20030706002413.GZ72567@perrin.int.nxad.com
In response to: Re: Moving postgresql.conf tunables into 2003... (Josh Berkus <josh@agliodbs.com>)
Responses:      Re: Moving postgresql.conf tunables into 2003...
List:           pgsql-performance
> > The SGML docs aren't in the DBA's face and are way out of the way
> > for DBAs rolling out a new system or who are tuning the system.
> > SGML == Developer, conf == DBA.
>
> That's exactly my point.  We cannot provide enough documentation in
> the CONF file without septupling its length.  IF we remove all
> commentary, and instead provide a pointer to the documentation, more
> DBAs will read it.

Which I don't think would happen, and is why I think the terse bits
that are included are worthwhile.  :)

> > Some of those parameters are based on hardware constraints and
> > should be pooled and organized as such.
> >
> > random_page_cost ==
> >   avg cost of a random disk seek/read (eg: disk seek time) ==
> >   constant integer for a given piece of hardware
>
> But, you see, this is exactly what I'm talking about.
> random_page_cost isn't static to a specific piece of hardware ... it
> depends as well on what else is on:

*) the disk/array translation: how fast data is accessed and over how
   many drives.

*) concurrent disk activity

   A disk/database activity metric is different from the cost of a
   seek on the platters.  :)  That PostgreSQL doesn't currently
   support such a disk concurrency metric doesn't mean its definition
   should get rolled into a different number in an attempt to
   compensate for the lack thereof.

*) disk controller settings

   This class of settings falls in with the settings that affect
   random seeks on the platters/disk array(s).

*) filesystem

   Again, this influences avg seek time.

*) OS

   Again, avg seek time.

*) distribution of records and tables

   This has nothing to do with PostgreSQL's random_page_cost setting,
   other than that if data is fragmented on the platter, the disk is
   going to have to do a lot of seeking.  This is a stat that should
   get set by ANALYZE, not by a human.

*) arrangement of the partitions on disk

   Again, avg seek time.

> One can certainly get a "good enough" value by benchmarking the
> disk's random seek and calculating based on that ... but to get an
> "ideal" value requires a long interactive session by someone with
> experience and in-depth knowledge of the machine and database.

An "ideal" value isn't obtained via guess and check.  Checking is
only the verification of some calculable set of settings ... though
right now those calculated settings are guessed, unfortunately.  (A
rough sketch of such a seek benchmark follows this message.)

> > There are other settings that are RAM based as well, which should
> > be formulaic and derived, though a formula hasn't been defined to
> > date.
>
> You seem pretty passionate about this ... how about you help me and
> Kevin define a benchmarking suite when I get back into the country
> (July 17)?  If we're going to define formulas, it requires that we
> have a near-comprehensive and consistent test database and test
> battery that we can run on a variety of machines and platforms.

Works for me, though a benchmark will be less valuable than adding a
disk concurrency stat, improving data trend/distribution analysis,
and using numbers that are concrete and obtainable through the OS
kernel API or an admin manually plunking numbers in.  (A sketch of
what such RAM-derived settings might look like also follows.)  I'm
still recovering from my move from Cali to WA, so with any luck I'll
be settled in by then.

-sc

--
Sean Chittenden
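
A minimal sketch of the "good enough" seek benchmark mentioned above:
it times N sequential 8 KB page reads against N random ones and
prints the ratio as a rough random_page_cost estimate.  The file
path, the read count, and the assumption that the test file is much
larger than RAM (so the OS cache doesn't absorb the random reads) are
all hypothetical, not anything specified in the thread.

    #!/usr/bin/env python
    # Estimate random_page_cost: time N sequential 8 KB page reads,
    # then N random 8 KB page reads, and report random/sequential.
    # Caveat (per the thread): this measures seeks on an idle disk,
    # not concurrent disk activity.
    import os
    import random
    import time

    PATH = "/var/tmp/seektest.dat"  # hypothetical large test file
    PAGE = 8192                     # PostgreSQL's default page size
    N = 2000                        # number of reads per pass

    pages = os.path.getsize(PATH) // PAGE
    fd = os.open(PATH, os.O_RDONLY)

    # Pass 1: sequential reads from the start of the file.
    os.lseek(fd, 0, os.SEEK_SET)
    t0 = time.time()
    for _ in range(N):
        os.read(fd, PAGE)
    seq = time.time() - t0

    # Pass 2: reads at random page offsets across the file.
    t0 = time.time()
    for _ in range(N):
        os.lseek(fd, random.randrange(pages) * PAGE, os.SEEK_SET)
        os.read(fd, PAGE)
    rnd = time.time() - t0
    os.close(fd)

    print("sequential %.3fs, random %.3fs, ratio ~ %.1f"
          % (seq, rnd, rnd / seq))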
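
And a sketch of what "formulaic, RAM-based" settings could look like.
The 25%/50% splits are hypothetical rules of thumb for illustration;
they are not the formula the post says has yet to be defined, and
SC_PHYS_PAGES is available only on some platforms (Linux, for
example).

    # Derive RAM-based settings from total physical memory.
    # The percentage splits below are illustrative assumptions only.
    import os

    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    pg_page = 8192  # PostgreSQL page size in bytes

    # ~25% of RAM for shared_buffers; assume ~50% of RAM ends up as
    # OS file cache, which is what effective_cache_size describes.
    shared_buffers = (ram // 4) // pg_page
    effective_cache_size = (ram // 2) // pg_page

    print("shared_buffers = %d        # in 8 KB pages" % shared_buffers)
    print("effective_cache_size = %d  # in 8 KB pages" % effective_cache_size)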