Re: question about index cost estimates - Mailing list pgsql-hackers
From: Jeff Hoffmann
Subject: Re: question about index cost estimates
Msg-id: 39234018.550E0BE6@propertykey.com
In response to: question about index cost estimates (Jeff Hoffmann <jeff@propertykey.com>)
Responses: Re: question about index cost estimates
List: pgsql-hackers
Tom Lane wrote:
> If the table is bigger than the disk cache, and we are hitting pages
> more than once, then we are likely to have to fetch some pages more than
> once.  The model is assuming that the tuples we want are scattered more
> or less randomly over the whole table.  If #tuples to be fetched
> approaches or exceeds #pages in table, then clearly we are going to be
> touching some pages more than once because we want more than one tuple
> from them.

i understand that, but still, if you have a tuple that's, say, 144 bytes, you're going to have a higher chance of revisiting the same page than if the tuple is 4k.  it may be something that works itself out of the equation somewhere.

> The $64 question is did the page fall out of buffer cache
> between references?  Worst case (if your table vastly exceeds your
> available buffer space), practically every tuple fetch will require a
> separate physical read, and then the number of page fetches is
> essentially the number of tuples returned --- which of course can be
> a lot more than the number of pages in the table.

i understand that, too.  i understand the need for a curve like that; maybe my question should be about the slope of the curve.  it turns out that, playing around with my queries, when i was selecting about 10% of the records the page fetch estimate is pretty accurate, but when the selectivity falls significantly below that, the page fetch estimate stays quite a bit higher than actual.

part of what i'm doing is looking at the executor stats for each query.  can you explain what the shared blocks, local blocks, and direct blocks mean?  i'm assuming that the shared blocks refers to the reads of the database files & indexes.  what do the others mean?  are the shared blocks shared amongst all of the backends, or does each have its own pool?
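to make the "scattered randomly" model concrete, here's a small sketch of the textbook estimate for distinct pages touched when k tuples are drawn uniformly over an N-page table.  this is just the standard probabilistic approximation, not necessarily the formula the planner actually uses:

```python
def expected_pages(N, k):
    """Expected number of distinct pages touched when k tuples are
    scattered uniformly at random over an N-page table.

    A given page is missed by one tuple with probability (1 - 1/N),
    so it is touched if at least one of the k tuples lands on it.
    """
    return N * (1.0 - (1.0 - 1.0 / N) ** k)

# when k is small relative to N, this is close to k (almost every
# tuple costs a fresh page); as k grows toward and past N, it
# flattens out toward N.  with no buffer cache at all, the worst
# case is simply k physical reads, which can exceed N because pages
# get re-read after falling out of the cache.
```

note this also hints at the tuple-size point above: smaller tuples mean more tuples per page, so for the same fraction of the table selected, k is larger relative to N and repeat visits to the same page become more likely.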
i'm getting a 90%+ buffer hit rate on index scans, but i'm assuming that's because i'm the only one doing anything on this machine right now, and that would go down with more processes.

> Right now these considerations are divided between cost_index() and
> cost_nonsequential_access() in a way that might well be wrong.
> I've been intending to dig into the literature and try to find a better
> cost model but haven't gotten to it yet.

well, there have been a lot of things that i've looked at and thought "that doesn't sound right" -- for example the random page access cost, which defaults to 4.0.  i thought that was too high, but i ran bonnie on my hard drive and it looks like the actual value is around 3.9.  so now i'm a believer.  somebody must have put some thought into it somewhere.  i'm still taking everything with a grain of salt until i can explain it, though.

jeff
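for anyone wanting to repeat that sanity check, here's a sketch of how a random-vs-sequential cost ratio can be derived from drive benchmark numbers.  the rates below are made-up placeholders, not the figures from the bonnie run described above:

```python
BLOCK = 8192  # postgres page size in bytes

# hypothetical benchmark results (substitute your own bonnie numbers)
seq_bytes_per_sec = 20_000_000   # sustained sequential read rate
random_ios_per_sec = 610         # random 8k reads per second

# time to read one page each way
seq_time_per_page = BLOCK / seq_bytes_per_sec
random_time_per_page = 1.0 / random_ios_per_sec

# this ratio is what a random_page_cost-style constant is meant to model
ratio = random_time_per_page / seq_time_per_page
print(round(ratio, 1))
```

with these placeholder numbers the ratio comes out right around 4, which is in the same ballpark as the default and the ~3.9 measurement mentioned above.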