Re: [HACKERS] [PROPOSAL] Temporal query processing with range types - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [HACKERS] [PROPOSAL] Temporal query processing with range types
Date
Msg-id CAB7nPqT_V6CjbL-Vv-skaMFesmEY5d2GWF24J61zWGxHNhwQQw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Peter Moser <pitiz29a@gmail.com>)
List pgsql-hackers
On Tue, Nov 21, 2017 at 6:36 PM, Peter Moser <pitiz29a@gmail.com> wrote:
> 2017-11-14 18:42 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> You might consider putting the rewriting into, um, the rewriter.
>> It could be a separate pass after view expansion, if direct integration
>> with the existing behavior seems unduly spaghetti-ish.  Or do it in
>> an early phase of planning as he suggested.  There's not really that
>> much difference between the rewriter and the planner for this purpose.
>> Although one way to draw the distinction is that the output of the
>> rewriter is (currently) still fully expressible as plain SQL, whereas
>> once the planner goes into action the intermediate states of the tree
>> might not really be SQL anymore (eg, it might contain join types that
>> don't correspond to any SQL syntax).  So depending on what your rewrite
>> emits, there would be a weak preference for calling it part of the
>> rewriter or planner respectively.
>
> 2017-11-16 16:42 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:
>> Another thing to think about is that even though the CURRENT
>> implementation just rewrites the relevant constructs as SQL, in the
>> future somebody might want to do something else.  I feel like it's not
>> hard to imagine a purpose-build ALIGN or NORMALIZE join type being a
>> lot faster than the version that's just done by rewriting the SQL.
>> That would be more work, potentially, but it would be nice if the
>> initial implementation leant itself to be extended that way in the
>> future, which an all-rewriter implementation would not.  On the other
>> hand, maybe an early-in-the-optimizer implementation wouldn't either,
>> and maybe it's not worth worrying about it anyway.  But it would be
>> cool if this worked out in a way that meant it could be further
>> improved without having to change it completely.
>
> Hi hackers,
> we like to rethink our approach...
>
> For simplicity I'll drop ALIGN for the moment and focus solely on NORMALIZE:
>
>     SELECT * FROM (R NORMALIZE S ON R.x = S.y WITH (R.time, S.time)) c;
>
> Our normalization executor node needs the following input (for now
> expressed in plain SQL):
>
>     SELECT R.*, p1
>     FROM (SELECT *, row_id() OVER () rn FROM R) R
>          LEFT OUTER JOIN (
>             SELECT y, LOWER(time) p1 FROM S
>             UNION
>             SELECT y, UPPER(time) p1 FROM S
>          ) S
>          ON R.x = S.y AND p1 <@ R.time
>     ORDER BY rn, p1;
>
> In other words:
> 1) The left subquery adds an unique ID to each tuple (i.e., rn).
> 2) The right subquery creates two results for each input tuple: one for
>    the upper and one for the lower bound of each input tuple's valid time
>    column. The boundaries get put into a single (scalar) column, namely p1.
> 3) We join both subqueries if the normalization predicates hold (R.x = S.y)
>    and p1 is inside the time of the current outer tuple.
> 4) Finally, we sort the result by the unique ID (rn) and p1, and give all
>    columns of the outer relation, rn and p1 back.
>
> Our first attempt to understand the new approach would be as follows: The
> left base rel of the inner left-outer-join can be expressed as a WindowAgg
> node. However, the right query of the join is much more difficult to build
> (maybe through hash aggregates). Both queries could be put together with a
> MergeJoin for instance. However, if we create the plan tree by hand and
> choose algorithms for it manually, how is it possible to have it optimized
> later? Or, if that is not possible, how do we choose the best algorithms
> for it?

As far as I can see, this patch has received some feedback. In order
to digest them properly, I am marking the patch as returned with
feedback.
-- 
Michael


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [HACKERS] Flexible configuration for full-text search
Next
From: Michael Paquier
Date:
Subject: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY executionwith errors handling