Re: POC, WIP: OR-clause support for indexes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: POC, WIP: OR-clause support for indexes
Date
Msg-id CA+TgmoaOiwMXBBTYknczepoZzKTp-Zgk5ss1+CuVQE-eFTqBmA@mail.gmail.com
Whole thread Raw
In response to Re: POC, WIP: OR-clause support for indexes  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: POC, WIP: OR-clause support for indexes
List pgsql-hackers
On Mon, Nov 27, 2023 at 5:16 PM Peter Geoghegan <pg@bowt.ie> wrote:
> [ various observations ]

This all seems to make sense but I don't have anything particular to
say about it.

> I am sure that there is a great deal of truth to this. The general
> conclusion about parse analysis being the wrong place for this seems
> very hard to argue with. But I'm much less sure that there needs to be
> a conventional cost model.

I'm not sure about that part, either. The big reason we shouldn't do
this in parse analysis is that parse analysis is supposed to produce
an internal representation which is basically just a direct
translation of what the user entered. The representation should be
able to be deparsed to produce more or less what the user entered
without significant transformations. References to objects like tables
and operators do get resolved to OIDs at this stage, so deparsing
results will vary if objects are renamed or the search_path changes
and more or less schema-qualification is required or things like that,
but the output of parse analysis is supposed to preserve the meaning
of the query as entered by the user. The right place to do
optimization is in the optimizer.

But where in the optimizer to do it is an open question in my mind.

Previous discussion suggests to me that we might not really have
enough information at the beginning, because it seems like the right
thing to do depends on which plan we ultimately choose to use, which
gets to what you say here:

> The planner's cost model is supposed to have some basis in physical
> runtime costs, which is not the case for any of these transformations.
> Not in any general sense; they're just transformations that enable
> finding a cheaper way to execute the query. While they have to pay for
> themselves, in some sense, I think that that's purely a matter of
> managing the added planner cycles. In principle they shouldn't have
> any direct impact on the physical costs incurred by physical
> operators. No?

Right. It's just that, as a practical matter, some of the operators
deal with one form better than the other. So if we waited until we
knew which operator we were using to decide on which form to pick,
that would let us be smart.

> As I keep pointing out, there is a sound theoretical basis to the idea
> of normalizing to conjunctive normal form as its own standard step in
> query processing. To some extent we do this already, but it's all
> rather ad-hoc. Even if (say) the nbtree preprocessing transformations
> that I described were something that the planner already knew about
> directly, they still wouldn't really need to be costed. They're pretty
> much strictly better at runtime (at most you only have to worry about
> the fixed cost of determining if they apply at all).

It's just a matter of figuring out where we can put the logic and have
the result make sense. We'd like to put it someplace where it's not
too expensive and gets the right answer.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: POC, WIP: OR-clause support for indexes
Next
From: Tom Lane
Date:
Subject: Re: SSL tests fail on OpenSSL v3.2.0