Re: AW: [HACKERS] Some notes on optimizer cost estimates - Mailing list pgsql-hackers

From Don Baccus
Subject Re: AW: [HACKERS] Some notes on optimizer cost estimates
Date
Msg-id 3.0.1.32.20000125153804.00f98e30@mail.pacifier.com
Whole thread Raw
In response to AW: [HACKERS] Some notes on optimizer cost estimates  (Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>)
List pgsql-hackers
At 12:16 AM 1/26/00 +0100, Peter Eisentraut wrote:
>On 2000-01-24, Zeugswetter Andreas SB mentioned:

>> My points are:
>> 1. even if it is good for an optimizer to be smart,
>>     it is even more important, that it is predictable
>
>ISTM that by the nature of things the most important capability of an
>optimizer is to yield optimal results.

One problem, though, is that the optimization problem's even more
intractable for database systems than it is for compilers.  Large and
growing database installations go through an evolutionary process of
adding memory, spindles, partioning of data, etc.  Changing from the
"-B" default to "-B 2000", as I've done on my web site, causes the
database to live in shared memory even when other activity on the
system would tend to cause some of it to be flushed from the filesystem
cache.  This changes how long, on average, it takes to read a block.

And the first time a table's referenced is always going to take longer
than subsequent references if there's caching involved...so in 
theory the optimizer should take that into account if it makes presumptions
about cache hit ratios.

So forth and so forth.

I'm not disagreeing in the least with the spirit of your comment.  There
are all sorts of practical barriers...

Thankfully, for the particular problem of choosing between and index
vs. sequential scan, the optimizer only has two choices to make.  Thus
a rough-and-ready heuristic based on certain assumptions might work well,
which is essentially what the optimizer does today.  In fact, it does
work quite well, come to think of it!

Fiddling the numbers underlying the assumptions may be good enough for
this task.

> This, however, does not have to be
>mutually exclusive with predictability. If you estimate some CPU and disk
>parameters and write them into a config file, then you can always give
>this config file to a bug fixer. It will still work on his machine, just
>less than optimally.
>
>> 2. I compile on test machine, production is completely different
>>     (more processors, faster disks and controllers)
>
>You're completely right. This has no place in configure. It will have to
>be a separate tool which you can run after building and installing.

Based perhaps on statistics gathered while the system is running...



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


pgsql-hackers by date:

Previous
From: Don Baccus
Date:
Subject: Re: Happy column adding (was RE: [HACKERS] Happy column dropping)
Next
From: Tom Lane
Date:
Subject: Re: Happy column adding (was RE: [HACKERS] Happy column dropping)