Thread: Support functions for range types

Support functions for range types

From
Kim Johan Andersson
Date:
I was surprised by the poor performance when I first tried to use range 
types. What I expected was that the following two queries would be 
equivalent (see attached script):

postgres=# EXPLAIN ANALYZE SELECT some_number FROM integer_test WHERE 
some_number BETWEEN -2 AND 2;
 
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
  Index Only Scan using integer_test_some_number_idx on integer_test 
(cost=0.28..8.38 rows=5 width=4) (actual time=0.045..0.052 rows=5 loops=1)
    Index Cond: ((some_number >= '-2'::integer) AND (some_number <= 2))
    Heap Fetches: 5
  Planning Time: 0.319 ms
  Execution Time: 0.094 ms
(5 rows)

postgres=# EXPLAIN ANALYZE SELECT some_number FROM integer_test WHERE 
some_number <@ int4range(-2, 2, '[]');
                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------
  Seq Scan on integer_test  (cost=0.00..34.01 rows=10 width=4) (actual 
time=0.585..1.136 rows=5 loops=1)
    Filter: (some_number <@ '[-2,3)'::int4range)
    Rows Removed by Filter: 1996
  Planning Time: 0.175 ms
  Execution Time: 1.164 ms
(5 rows)

But clearly, the planner is not able to use the btree index in the 
presence of the range operator.
So I attempted to add support functions for the 
'elem_contained_by_range' and 'range_contains_elem' operators (patch 
attached):
That gives the following execution plan (applied on 
26f7802beb2a4aafa0903f5bedeb7f1fa6f4f358):

 
QUERY PLAN 


-------------------------------------------------------------------------------------------------------------------------------------------------
  Index Only Scan using integer_test_some_number_idx on integer_test 
(cost=0.28..8.38 rows=10 width=4) (actual time=0.046..0.058 rows=5 loops=1)
    Index Cond: ((some_number >= '-2'::integer) AND (some_number < 3))
    Heap Fetches: 5
  Planning Time: 0.694 ms
  Execution Time: 0.114 ms
(5 rows)

That was what I was hoping to see (even though the row estimate is still 
a bit off).
Unfortunately this only works for the trivial case where the range is 
actually a constant.
The third query in the attached script (range_test.sql) produces the 
following plan, where the support function is not useful:

                                                        QUERY PLAN 


-------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.14..419.56 rows=22 width=12) (actual 
time=3.791..36.549 rows=121 loops=1)
    Join Filter: (integer_test.some_number <@ 
int4range(number_q.one_number, number_q.another_number, '[]'::text))
    Rows Removed by Join Filter: 21890
    CTE number_q
      ->  Function Scan on generate_series  (cost=0.00..0.14 rows=11 
width=8) (actual time=0.063..0.076 rows=11 loops=1)
    ->  CTE Scan on number_q  (cost=0.00..0.22 rows=11 width=8) (actual 
time=0.071..0.107 rows=11 loops=1)
    ->  Materialize  (cost=0.00..39.02 rows=2001 width=4) (actual 
time=0.011..0.516 rows=2001 loops=11)
          ->  Seq Scan on integer_test  (cost=0.00..29.01 rows=2001 
width=4) (actual time=0.077..1.043 rows=2001 loops=1)
  Planning Time: 3.172 ms
  Execution Time: 36.908 ms
(10 rows)

So my question here is, how to go about handling the more interesting 
cases, where we are passed a FuncExpr (instead of a Const)?
Is it even possible to return something useful in this case?

As far as I can tell, the support function is being passed a reference 
to the range constructor function when the range is not a constant.
But I don't have the insight required to build opclauses that can handle 
non-constants.
Any thoughs or pointers on solving this?

    Thanks,
            Kim Johan Andersson
Attachment

Re: Support functions for range types

From
Tom Lane
Date:
Kim Johan Andersson <kimjand@kimmet.dk> writes:
> So my question here is, how to go about handling the more interesting 
> cases, where we are passed a FuncExpr (instead of a Const)?
> Is it even possible to return something useful in this case?

Doesn't look like it to me.  You could check whether the RHS is a
range constructor function call, but there's a big semantic problem:
int4_range(NULL, ...) converts to a range with an infinite bound,
not a null bound.  So translating that to "indxvar >= NULL" would
give the wrong answers.  And generally speaking, if the argument
isn't a constant then you're not going to be able to be sure that
it doesn't produce NULL.

I guess you could produce something like

    indxvar >= coalesce(argument, minimum-value-of-type)

in cases where the data type has an identifiable minimum resp.
maximum value, but that'd make the whole affair annoyingly
data-type-specific.  Not sure it's worth going there.

            regards, tom lane



Re: Support functions for range types

From
Kim Johan Andersson
Date:
On 25-09-2022 16:43, Tom Lane wrote:
> 
> Doesn't look like it to me.  You could check whether the RHS is a
> range constructor function call, but there's a big semantic problem:
> int4_range(NULL, ...) converts to a range with an infinite bound,
> not a null bound.  So translating that to "indxvar >= NULL" would
> give the wrong answers.  And generally speaking, if the argument
> isn't a constant then you're not going to be able to be sure that
> it doesn't produce NULL.

I was suspecting that it would not be possible. So we can peek at the 
constructor parameters, but we won't know the actual value until execution.

> I guess you could produce something like
> 
>     indxvar >= coalesce(argument, minimum-value-of-type)
> 
> in cases where the data type has an identifiable minimum resp.
> maximum value, but that'd make the whole affair annoyingly
> data-type-specific.  Not sure it's worth going there.
> 

I was hoping it was possible to rely on the existing range mechanics for 
handling the data types. Along the lines of adding nodes dependent on 
the result of the constructor call.
But if there is no opportunity to make a dynamic expression suitable for 
the index, then I guess it won't be possible to make a really useful 
support function for range types.
Thanks for the input.

    Regards,
        Kim Johan Andersson



Re: Support functions for range types

From
Laurenz Albe
Date:
On Mon, 2022-09-26 at 06:57 +0200, Kim Johan Andersson wrote:
> But if there is no opportunity to make a dynamic expression suitable for 
> the index, then I guess it won't be possible to make a really useful 
> support function for range types.

I think it could still be useful if it only deals with constant operands.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com