Re: [HACKERS] [PROPOSAL] Temporal query processing with range types - Mailing list pgsql-hackers

From Peter Moser
Subject Re: [HACKERS] [PROPOSAL] Temporal query processing with range types
Date
Msg-id 1511256990.15349.3.camel@gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Michael Paquier <michael.paquier@gmail.com>)
Re: [HACKERS] [PROPOSAL] Temporal query processing with range types  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
2017-11-14 18:42 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
> You might consider putting the rewriting into, um, the rewriter.
> It could be a separate pass after view expansion, if direct integration
> with the existing behavior seems unduly spaghetti-ish.  Or do it in
> an early phase of planning as he suggested.  There's not really that
> much difference between the rewriter and the planner for this purpose.
> Although one way to draw the distinction is that the output of the
> rewriter is (currently) still fully expressible as plain SQL, whereas
> once the planner goes into action the intermediate states of the tree
> might not really be SQL anymore (eg, it might contain join types that
> don't correspond to any SQL syntax).  So depending on what your rewrite
> emits, there would be a weak preference for calling it part of the
> rewriter or planner respectively.

2017-11-16 16:42 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:
> Another thing to think about is that even though the CURRENT
> implementation just rewrites the relevant constructs as SQL, in the
> future somebody might want to do something else.  I feel like it's not
> hard to imagine a purpose-build ALIGN or NORMALIZE join type being a
> lot faster than the version that's just done by rewriting the SQL.
> That would be more work, potentially, but it would be nice if the
> initial implementation leant itself to be extended that way in the
> future, which an all-rewriter implementation would not.  On the other
> hand, maybe an early-in-the-optimizer implementation wouldn't either,
> and maybe it's not worth worrying about it anyway.  But it would be
> cool if this worked out in a way that meant it could be further
> improved without having to change it completely.

Hi hackers,
we like to rethink our approach...

For simplicity I'll drop ALIGN for the moment and focus solely on NORMALIZE:

    SELECT * FROM (R NORMALIZE S ON R.x = S.y WITH (R.time, S.time)) c;

Our normalization executor node needs the following input (for now
expressed in plain SQL):

    SELECT R.*, p1
    FROM (SELECT *, row_id() OVER () rn FROM R) R LEFT OUTER JOIN (    SELECT y, LOWER(time) p1 FROM
S    UNION    SELECTy, UPPER(time) p1 FROM S ) S ON R.x = S.y AND p1 <@ R.time
 
    ORDER BY rn, p1;

In other words:
1) The left subquery adds an unique ID to each tuple (i.e., rn).
2) The right subquery creates two results for each input tuple: one for
   the upper and one for the lower bound of each input tuple's valid time
   column. The boundaries get put into a single (scalar) column, namely p1.
3) We join both subqueries if the normalization predicates hold (R.x = S.y)
   and p1 is inside the time of the current outer tuple.
4) Finally, we sort the result by the unique ID (rn) and p1, and give all
   columns of the outer relation, rn and p1 back.

Our first attempt to understand the new approach would be as follows: The
left base rel of the inner left-outer-join can be expressed as a WindowAgg
node. However, the right query of the join is much more difficult to build
(maybe through hash aggregates). Both queries could be put together with a
MergeJoin for instance. However, if we create the plan tree by hand and
choose algorithms for it manually, how is it possible to have it optimized
later? Or, if that is not possible, how do we choose the best algorithms
for it?

Best regards,
Anton, Johann, Michael, Peter


pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] Assertion failure when the non-exclusive pg_stop_backup aborted.
Next
From: Amit Langote
Date:
Subject: Re: default range partition and constraint exclusion