Re: POC, WIP: OR-clause support for indexes - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: POC, WIP: OR-clause support for indexes |
Date | |
Msg-id | CA+TgmoaOiwMXBBTYknczepoZzKTp-Zgk5ss1+CuVQE-eFTqBmA@mail.gmail.com |
In response to | Re: POC, WIP: OR-clause support for indexes (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: POC, WIP: OR-clause support for indexes |
List | pgsql-hackers |
On Mon, Nov 27, 2023 at 5:16 PM Peter Geoghegan <pg@bowt.ie> wrote:
> [ various observations ]

This all seems to make sense but I don't have anything particular to say about it.

> I am sure that there is a great deal of truth to this. The general
> conclusion about parse analysis being the wrong place for this seems
> very hard to argue with. But I'm much less sure that there needs to be
> a conventional cost model.

I'm not sure about that part, either.

The big reason we shouldn't do this in parse analysis is that parse analysis is supposed to produce an internal representation which is basically just a direct translation of what the user entered. The representation should be able to be deparsed to produce more or less what the user entered without significant transformations. References to objects like tables and operators do get resolved to OIDs at this stage, so deparsing results will vary if objects are renamed or the search_path changes and more or less schema-qualification is required or things like that, but the output of parse analysis is supposed to preserve the meaning of the query as entered by the user.

The right place to do optimization is in the optimizer. But where in the optimizer to do it is an open question in my mind. Previous discussion suggests to me that we might not really have enough information at the beginning, because it seems like the right thing to do depends on which plan we ultimately choose to use, which gets to what you say here:

> The planner's cost model is supposed to have some basis in physical
> runtime costs, which is not the case for any of these transformations.
> Not in any general sense; they're just transformations that enable
> finding a cheaper way to execute the query. While they have to pay for
> themselves, in some sense, I think that that's purely a matter of
> managing the added planner cycles. In principle they shouldn't have
> any direct impact on the physical costs incurred by physical
> operators. No?

Right. It's just that, as a practical matter, some of the operators deal with one form better than the other. So if we waited until we knew which operator we were using to decide on which form to pick, that would let us be smart.

> As I keep pointing out, there is a sound theoretical basis to the idea
> of normalizing to conjunctive normal form as its own standard step in
> query processing. To some extent we do this already, but it's all
> rather ad-hoc. Even if (say) the nbtree preprocessing transformations
> that I described were something that the planner already knew about
> directly, they still wouldn't really need to be costed. They're pretty
> much strictly better at runtime (at most you only have to worry about
> the fixed cost of determining if they apply at all).

It's just a matter of figuring out where we can put the logic and have the result make sense. We'd like to put it someplace where it's not too expensive and gets the right answer.

--
Robert Haas
EDB: http://www.enterprisedb.com
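
To illustrate the "one form versus the other" point, here is a minimal Python sketch; it is not PostgreSQL code and is not taken from the patch. It assumes the two forms in question are a list of OR'ed equality comparisons on one column and the equivalent `= ANY (ARRAY[...])` expression. The function names `or_to_any` and `render`, and the `(column, operator, constant)` tuple representation of a clause, are hypothetical and exist only for this example.

```python
from collections import OrderedDict


def or_to_any(arms):
    """Rewrite the arms of an OR list (illustrative sketch only).

    `arms` is a list of (column, operator, constant) tuples that are
    implicitly joined by OR. Equality arms on the same column are merged
    into a single "= ANY (ARRAY[...])" arm; every other arm is passed
    through unchanged. Constants are assumed to be integers, so no
    quoting or type handling is attempted.
    """
    grouped = OrderedDict()  # column -> list of constants, in first-seen order
    passthrough = []         # arms that are not simple equalities

    for col, op, const in arms:
        if op == "=":
            grouped.setdefault(col, []).append(const)
        else:
            passthrough.append(f"{col} {op} {const}")

    merged = []
    for col, consts in grouped.items():
        if len(consts) == 1:
            # A lone equality gains nothing from the array form.
            merged.append(f"{col} = {consts[0]}")
        else:
            values = ", ".join(str(c) for c in consts)
            merged.append(f"{col} = ANY (ARRAY[{values}])")

    # OR is commutative, so moving pass-through arms to the end is safe.
    return merged + passthrough


def render(or_arms):
    """Join already-rendered arms back into one OR expression."""
    return " OR ".join(or_arms)


if __name__ == "__main__":
    arms = [("a", "=", 1), ("a", "=", 2), ("b", "<", 5)]
    print(render(or_arms=or_to_any(arms)))
```

The rewrite itself is the easy part. The open question in the message is where in the optimizer such a step should run, given that which form is preferable may depend on the plan that is eventually chosen.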