Thread: Using operators to do query hints

Using operators to do query hints

From
Greg Stark
Date:
I've been playing with an idea I had a while back. Basically that it
would be useful to have some "noop" operators that are used purely to
influence the planner.

For context I've suggested in the past that there are two categories of hints:

1 Hints that override the planner's decisions with explicit planning
decisions. These we don't like for a variety of reasons, notably
because users don't have very good knowledge of all the plan types
available and new plan types come out in the future.

2 Hints that just influence the estimates that the planner uses to
make its decisions. These seem more acceptable as it amounts to
admitting the users know the distribution of their data and the
behaviour of their functions better than the statistics. Also, there
are plenty of cases where the statistics are really just wild guesses.
They still allow the planner to make decisions about what join orders
or types and so on given the updated data.

So I thought adding some noop operators that took a boolean on one
side and a float on the other and simply returned the boolean at
run-time but used the float (which would have to be a constant) from
the lhs as the selectivity would be useful.

In practice it's usable but not nearly as widely useful as I expected.

1) The boolean being passed through has to be a real clause with real
fields. Otherwise the planner is too clever and turns it into a
one-time filter which doesn't affect the plan :/

2) If it's a clauase that is potentially going to be an index
condition then you need to repeat the clause outside the selectivity
noop operator. Otherwise the selectivity operator hides it from the
planner.

3) It doesn't work on joins at all. Joins are done using the join
selectivity function on the join clause's operator. There's no way to
pass construct a simple noop wrapper that would still work with that
that I can see.

Nonetheless what I have does seem to be somewhat handy for simple cases.

I added some simple SQL wrapper functions such as pg_unlikely():

postgres=# explain select * from test where i<100;
                             QUERY PLAN
---------------------------------------------------------------------
 Index Only Scan using i on test  (cost=0.28..10.01 rows=99 width=4)
   Index Cond: (i < 100)
(2 rows)

postgres=# explain select * from test where pg_unlikely(i<100);
                             QUERY PLAN
--------------------------------------------------------------------
 Index Only Scan using i on test  (cost=0.28..10.50 rows=1 width=4)
   Index Cond: (i < 100)
   Filter: ('1e-06'::double precision %%% (i < 100))
(3 rows)

However this doesn't really do what you might really be hoping.
specifically, it doesn't actually affect the planner's choice of the
index scan in that query. It affects the estimate of the result of the
scan and that might be useful for subsequent nodes of the plan. But
not for the index scan itself:

postgres=# explain select * from test where pg_unlikely(i<500);
                             QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=1 width=4)
   Filter: ((i < 500) AND ('1e-06'::double precision %%% (i < 500)))
(2 rows)


So..... I dunno. This feels like it's something that could be quite
handy but it's not as good as I had hoped and I think I'm out of ideas
for making it more powerful.


-- 
greg



Re: Using operators to do query hints

From
Bruce Momjian
Date:
On Tue, Feb 22, 2022 at 04:12:15PM -0500, Greg Stark wrote:
> I've been playing with an idea I had a while back. Basically that it
> would be useful to have some "noop" operators that are used purely to
> influence the planner.
> 
> For context I've suggested in the past that there are two categories of hints:
> 
> 1 Hints that override the planner's decisions with explicit planning
> decisions. These we don't like for a variety of reasons, notably
> because users don't have very good knowledge of all the plan types
> available and new plan types come out in the future.
> 
> 2 Hints that just influence the estimates that the planner uses to
> make its decisions. These seem more acceptable as it amounts to
> admitting the users know the distribution of their data and the
> behaviour of their functions better than the statistics. Also, there
> are plenty of cases where the statistics are really just wild guesses.
> They still allow the planner to make decisions about what join orders
> or types and so on given the updated data.

#2 is an interesting distinction.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.