Re: Volunteer to build a configuration tool - Mailing list pgsql-performance
From | Mark Lewis
---|---
Subject | Re: Volunteer to build a configuration tool
Date |
Msg-id | 1182360437.14407.40.camel@archimedes
In response to | Re: Volunteer to build a configuration tool (Greg Smith <gsmith@gregsmith.com>)
Responses | Re: Volunteer to build a configuration tool
List | pgsql-performance
On Wed, 2007-06-20 at 11:21 -0400, Greg Smith wrote:
...
> One of the things that was surprising to me when I started looking at the
> organization of the PostgreSQL buffer cache is how little gross
> information about its contents is available. I kept expecting to find a
> summary section where you could answer questions like "how much of the
> cache currently has information about index/table X?" used as an input to
> the optimizer. I understand that the design model expects much of this is
> unknowable due to the interaction with the OS cache, and in earlier
> versions you couldn't make shared_buffers big enough for its contents to
> be all that interesting, so until recently this wasn't worth collecting.
>
> But in the current era, where it's feasible to have multi-GB caches
> efficiently managed by PG and one can expect processor time is relatively
> cheap, it seems to me one way to give a major boost to the optimizer is to
> add some overhead to buffer cache management so it collects such
> information. When I was trying to do a complete overhaul on the
> background writer, the #1 problem was that I had to assemble my own
> statistics on what was inside the buffer cache as it was scanned, because
> a direct inspection of every buffer is the only way to know things like
> what percentage of the cache is currently dirty.
...

One problem with feeding the current state of the buffer cache to the planner is that the planner may be trying to prepare a plan which will execute 10,000 times. For many interesting queries, the state of the cache will be very different after the first execution, as indexes and active portions of tables are brought in. For that matter, an early stage of query execution could significantly change the contents of the buffer cache as seen by a later stage of execution, even inside a single query.

I'm not saying that inspecting the buffer cache more is a bad idea, but gathering useful information with the current planner is a bit tricky.

For purposes of idle speculation, one could envision some non-trivial changes to PG which would make really slick use of this data:

(1) Allow PG to defer deciding whether to perform an index scan or sequential scan until the moment the choice is needed, and then ask the buffer cache what % of the pages from the relevant indexes/tables are currently cached.

(2) Automatically re-plan prepared queries with some kind of frequency (exponential in # of executions? fixed-time?), to allow the plans to adjust to changes in the buffer cache.

Besides being hard to build, the problem with these approaches (or any other approach which takes current temporary state into account) is that as much as some of us might want to make use of every piece of data available to turn the planner into a super-brain, there are lots of other folks who just want plan stability. The more dynamic the system is, the less predictable it becomes, and especially in mission-critical settings, predictability matters more than raw speed.

Tom said it really well in a recent post: "To me, the worst catch-22 we face in this area is that we'd like the optimizer's choices of plan to be stable and understandable, but the real-world costs of queries depend enormously on short-term conditions such as how much of the table has been sucked into RAM recently by other queries. I have no good answer to that one."
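For what it's worth, the kind of inspection Greg describes can already be done by hand today with the contrib pg_buffercache module, which exposes the raw per-buffer data. A rough sketch (assuming the module is installed; `my_table` is a hypothetical table name) that answers both "how much of table X is cached?" and "how much of that is dirty?":

```sql
-- Rough sketch using the contrib pg_buffercache view.
-- Assumes pg_buffercache is installed; 'my_table' is a hypothetical name.
SELECT c.relname,
       count(*) AS buffers_cached,
       round(100.0 * count(*) * current_setting('block_size')::int
             / greatest(pg_relation_size(c.oid), 1), 1) AS pct_of_table_cached,
       sum(CASE WHEN b.isdirty THEN 1 ELSE 0 END) AS dirty_buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
JOIN pg_database d ON b.reldatabase = d.oid
                  AND d.datname = current_database()
WHERE c.relname = 'my_table'
GROUP BY c.relname, c.oid;
```

Of course, that only gives a DBA a snapshot to eyeball; the planner never sees it, which is the gap being discussed.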
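And the scheduling half of idea (2) can be approximated from the application side, by discarding and re-creating a prepared statement whenever its execution count hits the next power of two. A minimal sketch (`q` and `my_table` are hypothetical names; the application tracks the counter, since PG has no built-in re-plan knob):

```sql
-- Hypothetical client-side approximation of idea (2): re-plan after
-- executions 1, 2, 4, 8, ... Note that in stock PostgreSQL a fresh plan
-- only reflects updated statistics, not buffer cache contents -- that's
-- exactly the missing piece this thread is about.
PREPARE q(int) AS SELECT * FROM my_table WHERE id = $1;
EXECUTE q(42);
-- ... when the execution count reaches the next power of two:
DEALLOCATE q;
PREPARE q(int) AS SELECT * FROM my_table WHERE id = $1;  -- planner runs again here
```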