Re: [HACKERS] [PROPOSAL] Temporal query processing with range types - Mailing list pgsql-hackers
From | Peter Moser |
---|---|
Subject | Re: [HACKERS] [PROPOSAL] Temporal query processing with range types |
Date | |
Msg-id | CAHO0eLYyvuqwF=2FsgDn1xOs_NOrFBu9Xh-Wq+aWfFy0y6=jWQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] [PROPOSAL] Temporal query processing with range types (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
2017-02-15 20:24 GMT+01:00 Robert Haas <robertmhaas@gmail.com>: > So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN, > except apparently there's no join type and the optimizer can never > reorder these operations with each other or with other joins. Is that > right? The optimizer changes in this patch seem fairly minimal, so > I'm guessing it can't be doing anything very complex here. ALIGN/NORMALIZE operators are aliased from-clause items, which get rewritten into a subquery using LEFT OUTER JOIN. The main idea behind that is to reuse as much as possible of the existing PostgreSQL code, and just provide one executor function to process the output of these rewritten queries. The optimizer code is minimal, because we do not use any new constructs (except the temporal adjustment node for align/normalize) due to these rewrites. That is, all needed optimization techniques are already present. > What happens if you perform the ALIGN or NORMALIZE operation using > something other than an equality operator, like, say, less-than? Or > an arbitrary user-defined operator. It is possible to use ALIGN/NORMALIZE with user-defined functions, and non-equality operators. > There's no documentation in this patch. I'm not sure you want to go > to the trouble of writing SGML documentation until this has been > reviewed enough that it has a real chance of getting committed, but on > the other hand we're obviously all struggling to understand what it > does, so I think if not SGML documentation it at least needs a real > clear explanation of what the syntax is and does in a README or > something, even just for initial review. We are currently writing SGML documentation and extend in-code comments. Both, will be send soon. > It's hard to see what's going on here. What's ts? What's te? If you > used longer names for these things, it might be a bit more > self-documenting. ts, te describe an half-open interval in which the tuple is considered valid: [time point start, time point end). We have added an extended version of comments for both parser functions, i.e., transformTemporalAligner and transformTemporalNormalizer. See attached patch (src/backend/parser/parse_temporal.c). > If we are going to transform an ALIGN operator in to a left outer > join, why do we also have an executor node for it? > > + fcLowerLarg = makeFuncCall(SystemFuncName("lower"), > + > list_make1(crLargTs), > + > UNKNOWN_LOCATION); > + fcLowerRarg = makeFuncCall(SystemFuncName("lower"), > + > list_make1(crRargTs), > + > UNKNOWN_LOCATION); > + fcUpperLarg = makeFuncCall(SystemFuncName("upper"), > + > list_make1(crLargTs), > + > UNKNOWN_LOCATION); > + fcUpperRarg = makeFuncCall(SystemFuncName("upper"), > + > list_make1(crRargTs), > + > UNKNOWN_LOCATION); > > Why is a temporal operator calling functions that upper-case and > lower-case strings? In one sense this whole function (and much of the > nearby code) is very straightforward code and you can see exactly why > it's doing it. In another sense it's totally inscrutable: WHY is it > doing any of that stuff? These functions extract the lower-bound, and upper-bound of range types. > > - char *strategy; /* partitioning strategy > ('list' or 'range') */ > - List *partParams; /* List of PartitionElems */ > - int location; /* token > location, or -1 if unknown */ > + char *strategy; /* partitioning strategy ('list' or 'range') */ > + List *partParams; /* List of PartitionElems */ > + int location; /* token location, or > -1 if unknown */ > > I think this is some kind of mistake on your end while generating the > patch. It looks like you patched one version of the source code, and > diffed against another. Thank you for pointing this out, we fixed it in our new patch. 2017-02-16 13:41 GMT+01:00 Robert Haas <robertmhaas@gmail.com>: > Also, it sounds like all of this is intended to work with ranges that > are stored in different columns rather than with PostgreSQL's built-in > range types. Our syntax supports PostgreSQL's built-in range types and ranges that are stored in different columns. For instance, for range types an ALIGN query would look like this: SELECT * FROM (r ALIGN s ON q WITH (r.t, s.t)) c ... and for ranges in different columns like this: SELECT * FROM (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c ... where r and s are input relations, q can be any join qualifier, and r.t, s.t, r.ts, r.te, s.ts, and s.te can be any column name. The latter represent the valid time intervals, that is time point start, and time point end of each tuple for each input relation. These can be defined as four scalars, or two half-open, i.e., [), range typed values. Best regards, Anton, Johann, Michael, Peter ps. The patch has been rebased on top of commit 090f21bbad98001979da8589e9647a1d49bce4ee from "Sun Feb 19 17:18:10 2017 -0500" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: