Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays - Mailing list pgsql-hackers

From James Coleman
Subject Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date
Msg-id CAAaqYe-rj-tfJzNU8C+C6xoNomVCtvg_oi0St2djh4=Xvzm8OQ@mail.gmail.com
Whole thread Raw
In response to Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
On Tue, Apr 28, 2020 at 1:40 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> út 28. 4. 2020 v 18:17 odesílatel James Coleman <jtc331@gmail.com> napsal:
>>
>> On Tue, Apr 28, 2020 at 11:18 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> >
>> >
>> >
>> > út 28. 4. 2020 v 16:48 odesílatel Tomas Vondra <tomas.vondra@2ndquadrant.com> napsal:
>> >>
>> >> On Tue, Apr 28, 2020 at 03:43:43PM +0200, Pavel Stehule wrote:
>> >> >út 28. 4. 2020 v 15:26 odesílatel Tomas Vondra <tomas.vondra@2ndquadrant.com>
>> >> >napsal:
>> >> >
>> >> >> ...
>> >> >>
>> >> >> >I'm not so concerned about this in any query where we have a real FROM
>> >> >> >clause because even if we end up with only one row, the relative
>> >> >> >penalty is low, and the potential gain is very high. But simple
>> >> >> >expressions in pl/pgsql, for example, are a case where we can know for
>> >> >> >certain (correct me if I've wrong on this) that we'll only execute the
>> >> >> >expression once, which means there's probably always a penalty for
>> >> >> >choosing the implementation with setup costs over the default linear
>> >> >> >scan through the array.
>> >> >> >
>> >> >>
>> >> >> What do you mean by "simple expressions"? I'm not plpgsql expert and I
>> >> >> see it mostly as a way to glue together SQL queries, but yeah - if we
>> >> >> know a given ScalarArrayOpExpr will only be executed once, then we can
>> >> >> disable this optimization for now.
>> >> >>
>> >> >
>> >> >a := a + 1
>> >> >
>> >> >is translated to
>> >> >
>> >> >SELECT $1 + 1 and save result to var a
>> >> >
>> >> >The queries like this "SELECT $1 + 1" are simple expressions. They are
>> >> >evaluated just on executor level - it skip SPI
>> >> >
>> >> >the simple expression has not FROM clause, and have to return just one row.
>> >> >I am not sure if it is required, it has to return just one column.
>>
>> Yes, this is what I meant by simple expressions.
>>
>> >> >I am not sure if executor knows so expression is executed as simply
>> >> >expressions. But probably it can be deduced from context
>> >> >
>> >>
>> >> Not sure. The executor state is created by exec_eval_simple_expr, which
>> >> calls ExecInitExprWithParams (and it's the only caller). And that in
>> >> turn is the only place that leaves (state->parent == NULL). So maybe
>> >> that's a way to identify simple (standalone) expressions? Otherwise we
>> >> might add a new EEO_FLAG_* to identify these expressions explicitly.
>>
>> I'll look into doing one of these.
>>
>> >> I wonder if it would be possible to identify cases when the expression
>> >> is executed in a loop, e.g. like this:
>> >>
>> >>      FOR i IN 1..1000 LOOP
>> >>          x := y IN (1, 2, ..., 999);
>> >>      END LOOP;
>> >>
>> >> in which case we only build the ScalarArrayOpExpr once, so maybe we
>> >> could keep the hash table for all executions. But maybe that's not
>> >> possible or maybe it's pointless for other reasons. It sure looks a bit
>> >> like trying to build a query engine from FOR loop.
>> >
>> >
>> > Theoretically it is possible, not now. But I don't think so it is necessary. I cannot to remember this pattern in
anyplpgsql code and I never seen any request on this feature. 
>> >
>> > I don't think so this is task for plpgsql engine. Maybe for JIT sometimes.
>>
>> Agreed. I'd thought about this kind of scenario when I brought this
>> up, but I think solving it would the responsibility of the pg/pgsql
>> compiler rather than the expression evaluation code, because it'd have
>> to recognize the situation and setup a shared expression evaluation
>> context to be reused each time through the loop.
>
>
> can be nice if new implementation was not slower then older in all environments and context (including plpgsql
expressions)

Agreed, which is why I'm going to look into preventing using the new
code path for simple expressions.

James



pgsql-hackers by date:

Previous
From: Juan José Santamaría Flecha
Date:
Subject: Re: PG compilation error with Visual Studio 2015/2017/2019
Next
From: "Jonah H. Harris"
Date:
Subject: Re: Proposing WITH ITERATIVE