Re: [HACKERS] ASOF join - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] ASOF join
Date
Msg-id CAEepm=2BoDCetwbpJV2KSamr6SWGTW6vjfDE5xuqdFDssYjwoQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] ASOF join  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Wed, Jun 21, 2017 at 9:46 PM, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> Thank you for this idea. I agree that it is the best way of implementing
> ASOF join - just as optimization of standard SQL query.

Great.  I think this part definitely has potential.

> But do you think that still it will be good idea to extend SQL syntax with
> ASOF JOIN ... USING ... clause? It will significantly simplify writing
> queries like above
> and IMHO doesn't introduce some confusions with standard SQL syntax. My
> primary idea of suggesting ASOF join for Postgres was not  just building
> more efficient plan (using merge join instead of nested loop) but also
> simplifying writing of such queries. Or do you think that nobody will be
> interested in non-standard SQL extensions?

I can see the appeal, but I expect it to be difficult to convince the
project to accept a non-standard syntax for a niche use case that can
be expressed already.  Q is super terse and designed for time series
data.  SQL is neither of those things.

Some first reactions to the syntaxes you mentioned:

1.  times LEFT ASOF JOIN ticks ON ticks.time <= times.time
2.  times LEFT ASOF JOIN ticks USING (time)
3.  times LEFT ASOF JOIN ticks USING (ticks.time, times.time)

The USING ideas don't seem to be general enough, because there is no
place to say whether to use a lower or higher value if there is no
match, or did I miss something?  Relying on an ORDER BY clause in the
query to control the meaning of the join seems too weird, and making
it always (for example) <= would be an arbitrary limitation.  The
first syntax at least has enough information: when you say one of <,
>, <=, >= you also imply the search order.  I'm not sure if there are
any problems with that, perhaps when combined with other quals.

The equivalent nearly-standard syntax is definitely quite verbose, but
it has the merit of being absolutely explicit about which row from
'ticks' will be selected:
 times LEFT JOIN LATERAL (SELECT * FROM ticks                           WHERE ticks.time <= times.time
     ORDER BY ticks.time DESC LIMIT 1) x ON true
 

-- 
Thomas Munro
http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] An attempt to reduce WALWriteLock contention
Next
From: Kuntal Ghosh
Date:
Subject: Re: [HACKERS] An attempt to reduce WALWriteLock contention