Thread: AW: [HACKERS] Some notes on optimizer cost estimates
> > Couldn't we test some of these parameters inside configure and set
> > them there?
>
> If we could figure out a reasonably cheap way of estimating these
> numbers, it'd be worth setting up custom values at installation time.

Imho this whole idea is not so good. (Sorry)

My points are:
1. even if it is good for an optimizer to be smart,
   it is even more important that it is predictable
2. I compile on a test machine; production is completely different
   (more processors, faster disks and controllers)

Andreas
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
> My points are:
> 1. even if it is good for an optimizer to be smart,
>    it is even more important that it is predictable

A good point indeed. And unless we find that there is a huge range in
the ratios across different machines, we'd be wasting our time trying to
calibrate the numbers for a particular machine --- we could just as well
use an average value. The optimizer has many other, far worse, sources
of error than that.

> 2. I compile on a test machine; production is completely different
>    (more processors, faster disks and controllers)

In practice we'd do this at initdb time, not configure time, so I'm
not sure that that's really an issue. But your other point is
well taken.

			regards, tom lane
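The "calibrate at initdb time" idea above could amount to something as simple as a timing micro-benchmark run once at installation. The sketch below is hypothetical (it is not PostgreSQL source; the name `estimate_random_page_cost` and the page/file sizes are invented for illustration): it times sequential versus random page reads on a scratch file and reports the ratio. A real calibration tool would also have to defeat the OS filesystem cache (e.g. with a file larger than RAM), which is part of why the numbers are hard to pin down.

```python
# Hypothetical initdb-time calibration sketch: estimate the ratio of
# random to sequential page-read cost by timing both access patterns
# against a small scratch file. Names and sizes are illustrative only.
import os
import random
import tempfile
import time

PAGE = 8192    # PostgreSQL's traditional page size
NPAGES = 512   # size of the scratch file, in pages

def estimate_random_page_cost() -> float:
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * (PAGE * NPAGES))
        os.fsync(fd)

        def timed_read(order):
            start = time.perf_counter()
            for pageno in order:
                os.lseek(fd, pageno * PAGE, os.SEEK_SET)
                os.read(fd, PAGE)
            return time.perf_counter() - start

        seq = timed_read(range(NPAGES))
        rnd_order = list(range(NPAGES))
        random.shuffle(rnd_order)
        rnd = timed_read(rnd_order)
        return rnd / seq if seq > 0 else 1.0
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    print(f"estimated random/sequential cost ratio: "
          f"{estimate_random_page_cost():.2f}")
```

On a warm cache both patterns read from memory and the ratio collapses toward 1, which illustrates Tom's point: if the measured ratios do not vary much across machines, an average constant serves just as well.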
At 13:17 24/01/00 -0500, Tom Lane wrote:
>Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
>> My points are:
>> 1. even if it is good for an optimizer to be smart,
>>    it is even more important that it is predictable
>
>A good point indeed. And unless we find that there is a huge range in
>the ratios across different machines, we'd be wasting our time trying to
>calibrate the numbers for a particular machine --- we could just as well
>use an average value. The optimizer has many other, far worse, sources
>of error than that.
>
>> 2. I compile on a test machine; production is completely different
>>    (more processors, faster disks and controllers)
>
>In practice we'd do this at initdb time, not configure time, so I'm
>not sure that that's really an issue. But your other point is
>well taken.

I would guess it would become an issue if a server is upgraded
(better/more disks, faster CPU, etc.). This could be fixed by storing
the optimizer settings in a system table in the DB, and reading them the
first time a backend opens it. Just an idea.

If you *do* go with the 'store them in the DB' solution, then you also
need to provide a way of updating them (SQL, presumably), and a utility
to refresh them based on the current hardware.

What this then amounts to is 'hand-tuning' of optimizer settings, which
is an old chestnut that I would like to see reconsidered - sometimes,
for specific important queries, it is very good to be able to tell the
DB how to go about satisfying the query (or at least, what join order to
use and which indices to scan). Is this so far removed from the above,
philosophically? Is it worth considering?

Bye for now,

Philip Warner.
----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.C.N. 008 659 498)
Tel: +61-03-5367 7422
Fax: +61-03-5367 7430
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371
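The "store them in the DB, update them via SQL" scheme sketched above can be illustrated in a few lines. This is only a stand-in, using SQLite in place of a PostgreSQL system table; the table name `pg_cost_settings` and the parameter names are invented for the example.

```python
# Illustrative sketch of the "store optimizer settings in a system
# table" idea, with SQLite standing in for the real catalog. The table
# and parameter names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pg_cost_settings (
        name  TEXT PRIMARY KEY,
        value REAL NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO pg_cost_settings (name, value) VALUES (?, ?)",
    [("random_page_cost", 4.0), ("cpu_tuple_cost", 0.01)],
)

# A backend would read the settings once, the first time it opens the DB:
settings = dict(conn.execute("SELECT name, value FROM pg_cost_settings"))

# ...and a tuning utility could refresh them with plain SQL after a
# hardware upgrade:
conn.execute(
    "UPDATE pg_cost_settings SET value = ? WHERE name = ?",
    (2.0, "random_page_cost"),
)
settings = dict(conn.execute("SELECT name, value FROM pg_cost_settings"))
print(settings["random_page_cost"])  # → 2.0
```

Because the values live in the database rather than the binary, they survive a move to new hardware and can be refreshed without recompiling, which is exactly the upgrade scenario raised above.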
On 2000-01-24, Zeugswetter Andreas SB mentioned:

> > > Couldn't we test some of these parameters inside configure and set
> > > them there?
> >
> > If we could figure out a reasonably cheap way of estimating these
> > numbers, it'd be worth setting up custom values at installation time.
>
> Imho this whole idea is not so good. (Sorry)
>
> My points are:
> 1. even if it is good for an optimizer to be smart,
>    it is even more important that it is predictable

ISTM that by the nature of things the most important capability of an
optimizer is to yield optimal results. This, however, does not have to
be mutually exclusive with predictability. If you estimate some CPU and
disk parameters and write them into a config file, then you can always
give this config file to a bug fixer. It will still work on his machine,
just less than optimally.

> 2. I compile on a test machine; production is completely different
>    (more processors, faster disks and controllers)

You're completely right. This has no place in configure. It will have to
be a separate tool which you can run after building and installing.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
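The config-file variant described above is easy to picture: a post-install tool writes its estimates to a plain text file that can travel with a bug report. The sketch below is hypothetical (the `[costs]` section name and keys are invented), using Python's stdlib `configparser` purely to show the round trip.

```python
# Sketch of the "estimate parameters and write them into a config file"
# idea. Section and key names are invented for the example; an
# in-memory buffer stands in for the on-disk file.
import configparser
import io

# Values a post-install estimation tool might have measured:
estimates = {"random_page_cost": "4.0", "cpu_tuple_cost": "0.01"}

cfg = configparser.ConfigParser()
cfg["costs"] = estimates

buf = io.StringIO()   # stands in for a file under the data directory
cfg.write(buf)

# Another machine (say, a bug fixer's) can load the very same file;
# plans are then reproducible there, just tuned for the original box.
cfg2 = configparser.ConfigParser()
cfg2.read_string(buf.getvalue())
print(dict(cfg2["costs"]))
```

This is what makes the approach predictable in the sense argued for above: the numbers are explicit data, not something silently re-derived on every machine.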
At 12:16 AM 1/26/00 +0100, Peter Eisentraut wrote:
>On 2000-01-24, Zeugswetter Andreas SB mentioned:
>> My points are:
>> 1. even if it is good for an optimizer to be smart,
>>    it is even more important that it is predictable
>
>ISTM that by the nature of things the most important capability of an
>optimizer is to yield optimal results.

One problem, though, is that the optimization problem is even more
intractable for database systems than it is for compilers. Large and
growing database installations go through an evolutionary process of
adding memory, spindles, partitioning of data, etc.

Changing from the "-B" default to "-B 2000", as I've done on my web
site, causes the database to live in shared memory even when other
activity on the system would tend to cause some of it to be flushed
from the filesystem cache. This changes how long, on average, it takes
to read a block. And the first time a table is referenced is always
going to take longer than subsequent references if there's caching
involved... so in theory the optimizer should take that into account if
it makes presumptions about cache hit ratios. So forth and so forth.

I'm not disagreeing in the least with the spirit of your comment. There
are all sorts of practical barriers...

Thankfully, for the particular problem of choosing between an index
scan and a sequential scan, the optimizer only has two choices to make.
Thus a rough-and-ready heuristic based on certain assumptions might
work well, which is essentially what the optimizer does today. In fact,
it does work quite well, come to think of it! Fiddling the numbers
underlying the assumptions may be good enough for this task.

> This, however, does not have to be
>mutually exclusive with predictability. If you estimate some CPU and disk
>parameters and write them into a config file, then you can always give
>this config file to a bug fixer. It will still work on his machine, just
>less than optimally.
>
>> 2. I compile on a test machine; production is completely different
>> (more processors, faster disks and controllers)
>
>You're completely right. This has no place in configure. It will have to
>be a separate tool which you can run after building and installing.

Based perhaps on statistics gathered while the system is running...

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
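The closing suggestion above, feeding run-time statistics back into the cost parameters, could be as modest as smoothing observed page-read times with an exponential moving average, so that a warming cache gradually pulls the estimate down. The class name, units, and smoothing factor below are invented for illustration; this is one possible mechanism, not anything the thread's participants implemented.

```python
# Hypothetical sketch of refreshing a cost parameter from statistics
# gathered while the system runs: an exponential moving average of
# observed per-page read times. All names and the smoothing factor
# are invented for the example.
class RunningCostEstimate:
    def __init__(self, initial: float, alpha: float = 0.1):
        self.value = initial   # current estimate (say, ms per page read)
        self.alpha = alpha     # weight given to each new observation

    def observe(self, sample: float) -> None:
        # Move the estimate a fraction of the way toward the new sample.
        self.value += self.alpha * (sample - self.value)

est = RunningCostEstimate(initial=8.0)
# As the cache warms up, observed reads get cheaper and the estimate
# drifts downward without ever jumping unpredictably:
for sample in [8.0, 2.0, 2.0, 2.0]:
    est.observe(sample)
print(round(est.value, 2))
```

A small `alpha` is what keeps this compatible with the predictability concern raised at the top of the thread: each observation nudges the estimate rather than rewriting it, so plans do not flip-flop from one query to the next.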