Using operators to do query hints - Mailing list pgsql-hackers

From Greg Stark
Subject Using operators to do query hints
Date
Msg-id CAM-w4HOD7c7RozbOYXfFg0iDons5amoDJkP+bwNXTVV_2jUqAg@mail.gmail.com
Whole thread Raw
Responses Re: Using operators to do query hints
List pgsql-hackers
I've been playing with an idea I had a while back. Basically that it
would be useful to have some "noop" operators that are used purely to
influence the planner.

For context I've suggested in the past that there are two categories of hints:

1 Hints that override the planner's decisions with explicit planning
decisions. These we don't like for a variety of reasons, notably
because users don't have very good knowledge of all the plan types
available and new plan types come out in the future.

2 Hints that just influence the estimates that the planner uses to
make its decisions. These seem more acceptable as it amounts to
admitting the users know the distribution of their data and the
behaviour of their functions better than the statistics. Also, there
are plenty of cases where the statistics are really just wild guesses.
They still allow the planner to make decisions about what join orders
or types and so on given the updated data.

So I thought adding some noop operators that took a boolean on one
side and a float on the other and simply returned the boolean at
run-time but used the float (which would have to be a constant) from
the lhs as the selectivity would be useful.

In practice it's usable but not nearly as widely useful as I expected.

1) The boolean being passed through has to be a real clause with real
fields. Otherwise the planner is too clever and turns it into a
one-time filter which doesn't affect the plan :/

2) If it's a clauase that is potentially going to be an index
condition then you need to repeat the clause outside the selectivity
noop operator. Otherwise the selectivity operator hides it from the
planner.

3) It doesn't work on joins at all. Joins are done using the join
selectivity function on the join clause's operator. There's no way to
pass construct a simple noop wrapper that would still work with that
that I can see.

Nonetheless what I have does seem to be somewhat handy for simple cases.

I added some simple SQL wrapper functions such as pg_unlikely():

postgres=# explain select * from test where i<100;
                             QUERY PLAN
---------------------------------------------------------------------
 Index Only Scan using i on test  (cost=0.28..10.01 rows=99 width=4)
   Index Cond: (i < 100)
(2 rows)

postgres=# explain select * from test where pg_unlikely(i<100);
                             QUERY PLAN
--------------------------------------------------------------------
 Index Only Scan using i on test  (cost=0.28..10.50 rows=1 width=4)
   Index Cond: (i < 100)
   Filter: ('1e-06'::double precision %%% (i < 100))
(3 rows)

However this doesn't really do what you might really be hoping.
specifically, it doesn't actually affect the planner's choice of the
index scan in that query. It affects the estimate of the result of the
scan and that might be useful for subsequent nodes of the plan. But
not for the index scan itself:

postgres=# explain select * from test where pg_unlikely(i<500);
                             QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=1 width=4)
   Filter: ((i < 500) AND ('1e-06'::double precision %%% (i < 500)))
(2 rows)


So..... I dunno. This feels like it's something that could be quite
handy but it's not as good as I had hoped and I think I'm out of ideas
for making it more powerful.


-- 
greg



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: C++ Trigger Framework
Next
From: Shmuel Kamensky
Date:
Subject: Re: C++ Trigger Framework