Re: Read/Write block sizes - Mailing list pgsql-performance

From Chris Browne
Subject Re: Read/Write block sizes
Date
Msg-id 60br3npb4p.fsf@dba2.int.libertyrms.com
In response to Re: Read/Write block sizes (Was: Caching by Postgres)  (Jignesh Shah <J.K.Shah@Sun.COM>)
Responses Re: Read/Write block sizes  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-performance
spoe@sfnet.cc (Steve Poe) writes:
> Chris,
>
> Unless I am wrong, you're making the assumption that the amount of time
> spent and the ROI are known. Maybe those who've been down this path know
> how to get that additional 2-4% in 30 minutes or less?
>
> While each person's and business's performance gains (or not) will vary,
> wouldn't spending the 50-100h to gain 2-4% over the course of a month for
> a 24x7 operation seem worth the investment?

What we *do* know is that adding these "knobs" would involve a
significant amount of effort, as the values are widely used throughout
the database engine.  Making them dynamic (e.g., so they could be
tuned on a tablespace-by-tablespace basis) would undoubtedly require
rather a lot of development effort.  They are definitely NOT 30-minute
changes.
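
To make the scale concrete, here is a minimal sketch (not the actual
PostgreSQL source) of the kind of code that bakes the page size in at
compile time.  BLCKSZ is the real constant's name (the default is 8192
bytes); the struct and functions below are purely hypothetical
illustrations.  Every such site would have to trade its fixed-size
array for a runtime allocation plus a per-tablespace lookup before the
block size could become a knob.

/* Illustrative sketch only, not PostgreSQL source.  BLCKSZ is the real
 * compile-time constant name; everything else here is made up. */
#include <stdio.h>

#define BLCKSZ 8192                 /* fixed when the server is built */

typedef struct PageBuffer
{
    char    data[BLCKSZ];           /* sized by the constant, like many
                                     * buffers and offsets in the engine */
} PageBuffer;

/* The page size is implicit in the type and in the seek arithmetic. */
static size_t
read_page(FILE *fp, long pageno, PageBuffer *buf)
{
    if (fseek(fp, pageno * (long) BLCKSZ, SEEK_SET) != 0)
        return 0;
    return fread(buf->data, 1, BLCKSZ, fp);
}

int
main(int argc, char **argv)
{
    PageBuffer  buf;

    printf("page buffer is %zu bytes\n", sizeof(buf.data));

    if (argc > 1)
    {
        FILE       *fp = fopen(argv[1], "rb");

        if (fp != NULL)
        {
            printf("read %zu bytes of page 0\n", read_page(fp, 0, &buf));
            fclose(fp);
        }
    }
    return 0;
}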

Moreover, knowing how to adjust them is almost certainly also NOT a
30-minute configuration change; significant benchmarking effort for the
individual application is almost sure to be needed.

It's not much different from the reason why PostgreSQL doesn't use
threading...

The problem with using threading is that introducing it to the code
base would require a pretty enormous amount of effort (I'll bet
multiple person-years), and it wouldn't provide *any* benefit until
you get rather a long ways down the road.

Everyone involved in development seems to me to have a reasonably keen
understanding of what the potential benefits of threading are; the
value is that it would open up plenty of opportunities to parallelize
the evaluation of portions of queries.  Alas, it wouldn't be until
*after* all the effort had gone in that we would get any idea of what
kinds of speedups this would provide.
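
Purely as a hypothetical illustration of what "parallelizing the
evaluation of portions of queries" could mean, here is a toy pthreads
sketch that splits an aggregate (a sum over a table-like array) across
worker threads and then combines the partial results.  None of the
names correspond to anything in PostgreSQL, and it ignores all the
hard parts (shared buffers, locking, cleanup after an error in a
thread) that make the real effort so large.

/* Toy illustration only; nothing here is PostgreSQL code. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NROWS    1000000
#define NWORKERS 4

typedef struct WorkerArg
{
    const long *rows;           /* shared, read-only "table" */
    size_t      start;          /* first row for this worker */
    size_t      end;            /* one past the last row */
    long        partial_sum;    /* filled in by the worker */
} WorkerArg;

/* Each worker scans its own slice of the "table". */
static void *
scan_chunk(void *arg)
{
    WorkerArg  *w = arg;
    long        sum = 0;

    for (size_t i = w->start; i < w->end; i++)
        sum += w->rows[i];
    w->partial_sum = sum;
    return NULL;
}

int
main(void)
{
    long       *rows = malloc(NROWS * sizeof(long));
    pthread_t   threads[NWORKERS];
    WorkerArg   args[NWORKERS];
    long        total = 0;

    if (rows == NULL)
        return 1;
    for (size_t i = 0; i < NROWS; i++)
        rows[i] = (long) (i % 100);

    for (int t = 0; t < NWORKERS; t++)
    {
        args[t].rows = rows;
        args[t].start = (size_t) t * NROWS / NWORKERS;
        args[t].end = (size_t) (t + 1) * NROWS / NWORKERS;
        pthread_create(&threads[t], NULL, scan_chunk, &args[t]);
    }

    /* Combine the partial aggregates, as a parallel plan node would. */
    for (int t = 0; t < NWORKERS; t++)
    {
        pthread_join(threads[t], NULL);
        total += args[t].partial_sum;
    }

    printf("sum = %ld\n", total);
    free(rows);
    return 0;
}

(Build with something like cc -O2 -pthread.  The point is only that the
scan splits cleanly across workers, not that a real executor would be
structured this way.)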

In effect, a year would have to be invested in *breaking* PostgreSQL
(because this would initially break a lot, since thread programming is
a really tough skill) before you would actually see any benefits.

> I would assume that dbt2 with STP helps minimize the number of hours
> someone has to invest to determine performance gains with
> configurable options?

That's going to help in constructing a "default" knob value.  And if
we find an "optimal default," that encourages sticking with the
current approach of using a #define to apply that value...

>> If someone spends 100h working on one of these items, and gets a 2%
>> performance improvement, that's almost certain to be less desirable
>> than spending 50h on something else that gets a 4% improvement.
>>
>> And we might discover that memory management improvements in Linux
>> 2.6.16 or FreeBSD 5.5 allow some OS kernels to provide some such
>> improvements "for free" behind our backs without *any* need to write
>> database code.  :-)
--
let name="cbbrowne" and tld="ntlug.org" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/lisp.html
"For those  of you who are  into writing programs that  are as obscure
and complicated  as possible, there are opportunities  for... real fun
here" -- Arthur Norman
