Thread: WIP patch for LATERAL subqueries

WIP patch for LATERAL subqueries

From
Tom Lane
Date:
I've been idly amusing myself by trying to hack up support for
SQL-standard LATERAL subqueries.  I've got something that turns over,
more or less:

regression=# select * from int4_tbl a, lateral (select unique1,unique2 from tenk1 b where a.f1 = unique1) x;
 f1 | unique1 | unique2
----+---------+---------
  0 |       0 |    9998
(1 row)

regression=# explain select * from int4_tbl a, lateral (select unique1,unique2 from tenk1 b where a.f1 = unique1) x;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..42.55 rows=5 width=12)
   ->  Seq Scan on int4_tbl a  (cost=0.00..1.05 rows=5 width=4)
   ->  Index Scan using tenk1_unique1 on tenk1 b  (cost=0.00..8.28 rows=1 width=8)
         Index Cond: (a.f1 = unique1)
(4 rows)

but there's a good deal of work left to do, some of which could use some
discussion.

Feature/semantics issues:

Currently the patch only implements the syntax called out in the standard,
namely that you can put LATERAL in front of a <derived table>, which is
to say a parenthesized sub-SELECT in FROM.  It strikes me that it might be
worth allowing LATERAL with a function-in-FROM as well.  So basically
    LATERAL func(args) <alias>
would be an allowed abbreviation for
    LATERAL (SELECT * FROM func(args)) <alias>
Since the standard doesn't have function-in-FROM, it has nothing to say
about whether this is sane or not.  The argument for this is mainly that
SRFs are one of the main use-cases for LATERAL (replacing SRF-in-the-
SELECT-list usages), so we might as well make it convenient.  Any opinions
pro or con about that?

While fooling around in the planner I realized that I have no idea what
outer-level aggregates mean in a LATERAL subquery, and neither does
Postgres:
regression=# select 1 from tenk1 a, lateral (select * from int4_tbl b where f1 = max(a.unique1)) x;
ERROR:  plan should not reference subplan's variable
I don't see anything prohibiting this in SQL:2008, but ordinarily this
would be taken to be an outer-level aggregate, and surely that is not
sensible in the LATERAL subquery.  For the moment it seems like a good
idea to disallow it, though I am not sure where is a convenient place
to test for such things.  Has anyone got a clue about whether this is
well-defined, or is it simply an oversight in the spec?

Parser issues:

I'm reasonably happy with the grammar patch, though tempted to refactor
it to reduce the amount of duplication (and would be more tempted if we
add LATERAL function calls).  I'm thinking that an opt_alias production
could be used to eliminate the duplication, and am also strongly tempted
to move the error for no subselect alias out of the grammar and into
transformRangeSubselect.

Note that I made LATERAL be col_name_keyword.  It can no longer be allowed
as a function name because this would be formally ambiguous:
    LATERAL ((SELECT x FROM t)) t(x)
Is that a call on a function named LATERAL with a scalar-subquery
argument, or is it a LATERAL subquery with extra parentheses?  However,
there seems no point in making it fully reserved.  The <table_ref>
productions would still have to be repeated, because even with LATERAL
fully reserved, we can't combine them using an "opt_lateral" production.
On seeing "(" at the start of a FROM item, the parser doesn't know enough
to decide whether it should reduce opt_lateral to empty, which would be
the appropriate thing if the "(" starts a sub-select but not if it is,
say, a parenthesized JOIN tree.  We could only avoid that by allowing
opt_lateral before every type of table_ref and then throwing explicit
errors for the disallowed cases, which doesn't end up making the grammar
simpler.

Although lateral cross-references work okay for the successive-FROM-items
case, they don't work at all yet for JOIN cases:

regression=#  select * from int4_tbl a join lateral (select unique1,unique2 from tenk1 b where f1 = unique1) x on true;
ERROR:  column "f1" does not exist
LINE 1: ...ateral (select unique1,unique2 from tenk1 b where f1 = uniqu...
                                                             ^

regression=#  select * from int4_tbl a join lateral (select unique1,unique2 from tenk1 b where a.f1 = unique1) x on
true;
ERROR:  invalid reference to FROM-clause entry for table "a"
LINE 1: ...ateral (select unique1,unique2 from tenk1 b where a.f1 = uni...
                                                             ^
HINT:  There is an entry for table "a", but it cannot be referenced from this part of the query.

The reason that the separate-FROM-items case works is that
transformFromClause pushes each FROM-clause item into p_relnamespace and
p_varnamespace immediately after parsing it, making those names visible
during parsing of subsequent FROM items.  However, transformFromClauseItem
doesn't push the left-hand item into the lists before parsing the
right-hand item.

Now, the way this is being done currently is really pretty broken anyway.
As Andrew Gierth noted some time ago in
http://archives.postgresql.org/message-id/87ocpjscpa.fsf@news-spur.riddles.org.uk
it is incorrect to make these names visible to non-LATERAL subqueries,
because they may capture what should have been a valid reference to a
parent-level variable.  Furthermore, it's pretty grotty to allow the
reference and then have to re-scan the subquery in transformRangeSubselect
to see if we allowed anything we should have disallowed.

What I'm thinking of, but have not yet tried to code, is that the
p_relnamespace and p_varnamespace lists should be divided into pairs
(so four lists altogether per ParseState).  p_relnamespace/p_varnamespace
should always contain exactly those RTEs that are validly referenceable
by qualified or unqualified Vars (respectively) at the current point in
parsing.  The new lists, say p_relnamespace_lateral/p_varnamespace_lateral,
contain RTEs that are validly referenceable inside a LATERAL subquery
occuring at the current point in this ParseState's query.  We'd also want
a p_lateral_active boolean to show whether we're inside a LATERAL
subquery; that is what would tell variable lookup whether it should search
the p_xxx_lateral lists.  With a data structure like this, I think we
can fix things so that only valid references are ever accepted and there
is no need for rechecking in transformRangeSubselect (or
transformJoinOnClause for that matter).  The main reason for the current
arrangement is to be able to throw useful errors in case of an illegal
lateral reference; but I think we can still do that at the point of the
illegal reference, by groveling through the whole p_rtable list looking
to see if there would have been a match (which is more or less what
searchRangeTable already does for qualified references, so we'd just be
extending that approach to unqualified names).

One fine point here is that a LATERAL subquery in the RHS of a JOIN clause
is only allowed to reference the LHS of the JOIN when the join type is not
RIGHT or FULL.  The way I am inclined to implement this is to not add the
LHS to the p_xxx_lateral lists when the join type is wrong, so that the
LHS is simply not in scope in the RHS.  If you read SQL:2008 carefully,
their notion of how to handle this seems to be that the LHS *is* in scope
(since section 7.6 <table reference> syntax rule 6a doesn't say anything
about join types) but then you have to throw an error if the LHS is
actually referenced and the join type is wrong (section 7.7 <joined table>
syntax rule 2).  To do it exactly like they say, we'd need to add some
kind of annotation to the p_xxx_lateral list items when they're on the
wrong side of a join, which would be a real PITA I think.  In normal
cases, the simpler implementation would just lead to a different error
message.  But it's conceivable that it would accept a query as valid (by
resolving an ambiguous reference as matching some outer query level) when
the spec would say it's invalid.  I'm inclined to think that the spec is
simply poorly thought out here.  I note that the prohibition against
RIGHT/FULL joins is not there in SQL:99, so they definitely weren't
thinking straight then, and the section 7.7 rule looks like a band-aid
over the SQL:99 mistake rather than a good fix.  So I don't feel too bad
about deviating in this corner case, but I wonder if anyone else feels
differently, and if so whether they have an idea for a clean
implementation that matches the spec exactly.

Planner issues:

For the moment, I've hacked prepjointree.c's is_simple_subquery() to
prevent pull-up of LATERAL subqueries.  This is just to restrict the scope
of the planner changes to SubqueryScan paths/plans.  Relaxing the
restriction will require making sure that all other path/plan types can be
parameterized, which is something I figure can be left for later.  This
does mean there are cases that won't be optimized as nicely as one could
wish, but for a work-in-progress implementation that doesn't bother me.

The thing that is most worrisome in this area is that heretofore, the
planner has assumed that every relation has at least one unparameterized
path; but for a LATERAL subquery, or anything we might pull up out of it,
there are no such paths.  The main implication of this in the code is that
a RelOptInfo's cheapest_startup_path/cheapest_total_path might not exist,
at least not with the current definition that they're the cheapest
unparameterized paths.  For the moment I've dealt with this by lobotomizing
a lot of places where these paths were assumed to not be NULL.  I think
however that that's unduly constraining join planning: basically, we won't
ever consider a merge or hash join involving a still-parameterized lateral
subquery, and that's probably not good.  I'm considering altering the
definitions of these fields to be "the cheapest minimally-parameterized
paths", but haven't quite decided if that's a good idea or not.  There
are various areas such as GEQO that will crash on lateral subqueries
pending a resolution of this, because there wasn't any easy way to just
make them punt.

Also, there are at least three places --- extract_lateral_references,
set_subquery_pathlist, and identify_nestloop_extparams --- that are
independently re-deriving information about the sets of lateral references
in a subquery.  This seems like a bit of a crock; I'd be happier if we
could do the work just once in some fashion.  I note that SubLink
processing has a very similar problem of needing to pull out all the
upper references to form "args" lists, too.  Not sure how to refactor
that, but maybe we should expect the parser to provide annotation about
outer refs instead of making the planner re-derive it?

Executor issues:

AFAICT, there aren't any.  Sweet.  The parameterization work I did two
years ago held up.


Comments, better ideas?

            regards, tom lane

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 799930ad611b6e09403a3d14c5738f47cb2f3bab..f72f62b4c391e6ad42744c9c54be02fa6af716a9 100644
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** _copyRangeTblEntry(const RangeTblEntry *
*** 1973,1978 ****
--- 1973,1979 ----
      COPY_NODE_FIELD(ctecolcollations);
      COPY_NODE_FIELD(alias);
      COPY_NODE_FIELD(eref);
+     COPY_SCALAR_FIELD(lateral);
      COPY_SCALAR_FIELD(inh);
      COPY_SCALAR_FIELD(inFromCl);
      COPY_SCALAR_FIELD(requiredPerms);
*************** _copyRangeSubselect(const RangeSubselect
*** 2250,2255 ****
--- 2251,2257 ----
  {
      RangeSubselect *newnode = makeNode(RangeSubselect);

+     COPY_SCALAR_FIELD(lateral);
      COPY_NODE_FIELD(subquery);
      COPY_NODE_FIELD(alias);

diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 802b0636714d2c34013275132c531ebc7ae1bdd6..cb6e272a22f9a13bdc5bb85ef99d7933b775a1f9 100644
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
*************** _equalWindowDef(const WindowDef *a, cons
*** 2161,2166 ****
--- 2161,2167 ----
  static bool
  _equalRangeSubselect(const RangeSubselect *a, const RangeSubselect *b)
  {
+     COMPARE_SCALAR_FIELD(lateral);
      COMPARE_NODE_FIELD(subquery);
      COMPARE_NODE_FIELD(alias);

*************** _equalRangeTblEntry(const RangeTblEntry
*** 2287,2292 ****
--- 2288,2294 ----
      COMPARE_NODE_FIELD(ctecolcollations);
      COMPARE_NODE_FIELD(alias);
      COMPARE_NODE_FIELD(eref);
+     COMPARE_SCALAR_FIELD(lateral);
      COMPARE_SCALAR_FIELD(inh);
      COMPARE_SCALAR_FIELD(inFromCl);
      COMPARE_SCALAR_FIELD(requiredPerms);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b83bd1c9fdb132d8b76e65759deade49b73fb4eb..b8219af16d16f93b498f3d1e37b7a4e6e43cb934 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outRangeTblEntry(StringInfo str, const
*** 2362,2367 ****
--- 2362,2368 ----
              break;
      }

+     WRITE_BOOL_FIELD(lateral);
      WRITE_BOOL_FIELD(inh);
      WRITE_BOOL_FIELD(inFromCl);
      WRITE_UINT_FIELD(requiredPerms);
*************** _outRangeSubselect(StringInfo str, const
*** 2565,2570 ****
--- 2566,2572 ----
  {
      WRITE_NODE_TYPE("RANGESUBSELECT");

+     WRITE_BOOL_FIELD(lateral);
      WRITE_NODE_FIELD(subquery);
      WRITE_NODE_FIELD(alias);
  }
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ff77cefd07be34dc0a3363b7b1c8db85a4691e38..1eb7582914e75276967da46bb6db45fe3648167c 100644
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
*************** _readRangeTblEntry(void)
*** 1222,1227 ****
--- 1222,1228 ----
              break;
      }

+     READ_BOOL_FIELD(lateral);
      READ_BOOL_FIELD(inh);
      READ_BOOL_FIELD(inFromCl);
      READ_UINT_FIELD(requiredPerms);
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index 1e352fd3b5ea7e9961a6befe0399697a0c863e54..8a1ca78ee66f60547f6c4b4db84b0dd2a034b999 100644
*** a/src/backend/optimizer/geqo/geqo_eval.c
--- b/src/backend/optimizer/geqo/geqo_eval.c
*************** geqo_eval(PlannerInfo *root, Gene *tour,
*** 105,110 ****
--- 105,112 ----
       *
       * XXX geqo does not currently support optimization for partial result
       * retrieval --- how to fix?
+      *
+      * XXX what happens in LATERAL cases?
       */
      fitness = joinrel->cheapest_total_path->total_cost;

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index f02954982a7e41619a0282f2ce095282b30d9cd8..e911b6c9849094dbc5828d42b77db614223ba0e3 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*************** set_append_rel_pathlist(PlannerInfo *roo
*** 701,706 ****
--- 701,708 ----
          /*
           * Child is live, so add its cheapest access path to the Append path
           * we are constructing for the parent.
+          *
+          * XXX what if child is LATERAL?
           */
          subpaths = accumulate_append_subpath(subpaths,
                                               childrel->cheapest_total_path);
*************** generate_mergeappend_paths(PlannerInfo *
*** 903,908 ****
--- 905,912 ----
              /*
               * If we can't find any paths with the right order just use the
               * cheapest-total path; we'll have to sort it later.
+              *
+              * XXX what if child is LATERAL?
               */
              if (cheapest_startup == NULL || cheapest_total == NULL)
              {
*************** set_subquery_pathlist(PlannerInfo *root,
*** 1021,1026 ****
--- 1025,1031 ----
  {
      Query       *parse = root->parse;
      Query       *subquery = rte->subquery;
+     Relids        required_outer;
      bool       *differentTypes;
      double        tuple_fraction;
      PlannerInfo *subroot;
*************** set_subquery_pathlist(PlannerInfo *root,
*** 1033,1038 ****
--- 1038,1053 ----
       */
      subquery = copyObject(subquery);

+     /*
+      * If it's a LATERAL subquery, it might contain some Vars of the current
+      * query level, requiring it to be treated as parameterized.
+      */
+     required_outer = pull_varnos_of_level((Node *) subquery, 1);
+
+     /* Enforce convention that empty required_outer sets are exactly NULL */
+     if (bms_is_empty(required_outer))
+         required_outer = NULL;
+
      /* We need a workspace for keeping track of set-op type coercions */
      differentTypes = (bool *)
          palloc0((list_length(subquery->targetList) + 1) * sizeof(bool));
*************** set_subquery_pathlist(PlannerInfo *root,
*** 1134,1140 ****
      pathkeys = convert_subquery_pathkeys(root, rel, subroot->query_pathkeys);

      /* Generate appropriate path */
!     add_path(rel, create_subqueryscan_path(root, rel, pathkeys, NULL));

      /* Select cheapest path (pretty easy in this case...) */
      set_cheapest(rel);
--- 1149,1155 ----
      pathkeys = convert_subquery_pathkeys(root, rel, subroot->query_pathkeys);

      /* Generate appropriate path */
!     add_path(rel, create_subqueryscan_path(root, rel, pathkeys, required_outer));

      /* Select cheapest path (pretty easy in this case...) */
      set_cheapest(rel);
*************** debug_print_rel(PlannerInfo *root, RelOp
*** 1988,1997 ****
      printf("\tpath list:\n");
      foreach(l, rel->pathlist)
          print_path(root, lfirst(l), 1);
!     printf("\n\tcheapest startup path:\n");
!     print_path(root, rel->cheapest_startup_path, 1);
!     printf("\n\tcheapest total path:\n");
!     print_path(root, rel->cheapest_total_path, 1);
      printf("\n");
      fflush(stdout);
  }
--- 2003,2018 ----
      printf("\tpath list:\n");
      foreach(l, rel->pathlist)
          print_path(root, lfirst(l), 1);
!     if (rel->cheapest_startup_path)
!     {
!         printf("\n\tcheapest startup path:\n");
!         print_path(root, rel->cheapest_startup_path, 1);
!     }
!     if (rel->cheapest_total_path)
!     {
!         printf("\n\tcheapest total path:\n");
!         print_path(root, rel->cheapest_total_path, 1);
!     }
      printf("\n");
      fflush(stdout);
  }
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 65f86194e155508baa2e7c690a08be7d0380299a..3be0e11a9492d297d530c4ada73bfd233b35d3e5 100644
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
*************** sort_inner_and_outer(PlannerInfo *root,
*** 491,502 ****
       * explosion of mergejoin paths of dubious value.  This interacts with
       * decisions elsewhere that also discriminate against mergejoins with
       * parameterized inputs; see comments in src/backend/optimizer/README.
-      *
-      * If unique-ification is requested, do it and then handle as a plain
-      * inner join.
       */
      outer_path = outerrel->cheapest_total_path;
      inner_path = innerrel->cheapest_total_path;
      if (jointype == JOIN_UNIQUE_OUTER)
      {
          outer_path = (Path *) create_unique_path(root, outerrel,
--- 491,508 ----
       * explosion of mergejoin paths of dubious value.  This interacts with
       * decisions elsewhere that also discriminate against mergejoins with
       * parameterized inputs; see comments in src/backend/optimizer/README.
       */
      outer_path = outerrel->cheapest_total_path;
      inner_path = innerrel->cheapest_total_path;
+
+     /* Punt if either rel has only parameterized paths */
+     if (!outer_path || !inner_path)
+         return;
+
+     /*
+      * If unique-ification is requested, do it and then handle as a plain
+      * inner join.
+      */
      if (jointype == JOIN_UNIQUE_OUTER)
      {
          outer_path = (Path *) create_unique_path(root, outerrel,
*************** match_unsorted_outer(PlannerInfo *root,
*** 692,698 ****

      /*
       * If we need to unique-ify the inner path, we will consider only the
!      * cheapest-total inner.
       */
      if (save_jointype == JOIN_UNIQUE_INNER)
      {
--- 698,704 ----

      /*
       * If we need to unique-ify the inner path, we will consider only the
!      * cheapest-total inner.  XXX what about LATERAL?
       */
      if (save_jointype == JOIN_UNIQUE_INNER)
      {
*************** match_unsorted_outer(PlannerInfo *root,
*** 707,713 ****
           * enable_material is off or the path in question materializes its
           * output anyway.
           */
!         if (enable_material &&
              !ExecMaterializesOutput(inner_cheapest_total->pathtype))
              matpath = (Path *)
                  create_material_path(innerrel, inner_cheapest_total);
--- 713,719 ----
           * enable_material is off or the path in question materializes its
           * output anyway.
           */
!         if (enable_material && inner_cheapest_total != NULL &&
              !ExecMaterializesOutput(inner_cheapest_total->pathtype))
              matpath = (Path *)
                  create_material_path(innerrel, inner_cheapest_total);
*************** match_unsorted_outer(PlannerInfo *root,
*** 735,740 ****
--- 741,748 ----
           * If we need to unique-ify the outer path, it's pointless to consider
           * any but the cheapest outer.    (XXX we don't consider parameterized
           * outers, nor inners, for unique-ified cases.    Should we?)
+          *
+          * XXX wrong for LATERAL?
           */
          if (save_jointype == JOIN_UNIQUE_OUTER)
          {
*************** match_unsorted_outer(PlannerInfo *root,
*** 814,819 ****
--- 822,831 ----
          if (save_jointype == JOIN_UNIQUE_OUTER)
              continue;

+         /* Can't do anything else if inner has no unparameterized paths */
+         if (!inner_cheapest_total)
+             continue;
+
          /* Look for useful mergeclauses (if any) */
          mergeclauses = find_mergeclauses_for_pathkeys(root,
                                                        outerpath->pathkeys,
*************** hash_inner_and_outer(PlannerInfo *root,
*** 1092,1097 ****
--- 1104,1115 ----
          Path       *cheapest_total_outer = outerrel->cheapest_total_path;
          Path       *cheapest_total_inner = innerrel->cheapest_total_path;

+         /* Punt if either rel has only parameterized paths */
+         if (!cheapest_startup_outer ||
+             !cheapest_total_outer ||
+             !cheapest_total_inner)
+             return;
+
          /* Unique-ify if need be; we ignore parameterized possibilities */
          if (jointype == JOIN_UNIQUE_OUTER)
          {
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index e6a0f8dab6db311b816b2c139eb5dbc343e181ee..d7887da7a2a73ee282481a424b261359409de342 100644
*** a/src/backend/optimizer/path/joinrels.c
--- b/src/backend/optimizer/path/joinrels.c
*************** join_is_legal(PlannerInfo *root, RelOptI
*** 383,388 ****
--- 383,389 ----
          }
          else if (sjinfo->jointype == JOIN_SEMI &&
                   bms_equal(sjinfo->syn_righthand, rel2->relids) &&
+                  rel2->cheapest_total_path != NULL &&
                   create_unique_path(root, rel2, rel2->cheapest_total_path,
                                      sjinfo) != NULL)
          {
*************** join_is_legal(PlannerInfo *root, RelOptI
*** 416,421 ****
--- 417,423 ----
          }
          else if (sjinfo->jointype == JOIN_SEMI &&
                   bms_equal(sjinfo->syn_righthand, rel1->relids) &&
+                  rel1->cheapest_total_path != NULL &&
                   create_unique_path(root, rel1, rel1->cheapest_total_path,
                                      sjinfo) != NULL)
          {
*************** make_join_rel(PlannerInfo *root, RelOptI
*** 672,677 ****
--- 674,680 ----
               * anyway to be sure.)
               */
              if (bms_equal(sjinfo->syn_righthand, rel2->relids) &&
+                 rel2->cheapest_total_path != NULL &&
                  create_unique_path(root, rel2, rel2->cheapest_total_path,
                                     sjinfo) != NULL)
              {
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 414406bb8a13422dae87086aa1114b72683da0f1..572711c887bd62bd303bb98191d77ef7555c3167 100644
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static HashJoin *create_hashjoin_plan(Pl
*** 84,89 ****
--- 84,90 ----
                       Plan *outer_plan, Plan *inner_plan);
  static Node *replace_nestloop_params(PlannerInfo *root, Node *expr);
  static Node *replace_nestloop_params_mutator(Node *node, PlannerInfo *root);
+ static void identify_nestloop_extparams(PlannerInfo *root, Plan *subplan);
  static List *fix_indexqual_references(PlannerInfo *root, IndexPath *index_path);
  static List *fix_indexorderby_references(PlannerInfo *root, IndexPath *index_path);
  static Node *fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol);
*************** create_subqueryscan_plan(PlannerInfo *ro
*** 1640,1645 ****
--- 1641,1647 ----
      {
          scan_clauses = (List *)
              replace_nestloop_params(root, (Node *) scan_clauses);
+         identify_nestloop_extparams(root, best_path->parent->subplan);
      }

      scan_plan = make_subqueryscan(tlist,
*************** replace_nestloop_params_mutator(Node *no
*** 2560,2565 ****
--- 2562,2663 ----
  }

  /*
+  * identify_nestloop_extparams
+  *      Identify extParams of a parameterized subquery that need to be fed
+  *      from an outer nestloop.
+  *
+  * The subplan's references to the outer variables are already represented
+  * as PARAM_EXEC Params, so we need not modify the subplan here.  What we
+  * do need to do is add entries to root->curOuterParams to signal the parent
+  * nestloop plan node that it must provide these values.
+  */
+ static void
+ identify_nestloop_extparams(PlannerInfo *root, Plan *subplan)
+ {
+     Bitmapset  *tmpset;
+     int            paramid;
+
+     /* Examine each extParam of the subquery's plan */
+     tmpset = bms_copy(subplan->extParam);
+     while ((paramid = bms_first_member(tmpset)) >= 0)
+     {
+         PlannerParamItem *pitem = list_nth(root->glob->paramlist, paramid);
+
+         /* Ignore anything coming from an upper query level */
+         if (pitem->abslevel != root->query_level)
+             continue;
+
+         if (IsA(pitem->item, Var))
+         {
+             Var           *var = (Var *) pitem->item;
+             NestLoopParam *nlp;
+             ListCell   *lc;
+
+             /* If not from a nestloop outer rel, nothing to do */
+             if (!bms_is_member(var->varno, root->curOuterRels))
+                 continue;
+             /* Is this param already listed in root->curOuterParams? */
+             foreach(lc, root->curOuterParams)
+             {
+                 nlp = (NestLoopParam *) lfirst(lc);
+                 if (nlp->paramno == paramid)
+                 {
+                     Assert(equal(var, nlp->paramval));
+                     /* Present, so nothing to do */
+                     break;
+                 }
+             }
+             if (lc == NULL)
+             {
+                 /* No, so add it */
+                 nlp = makeNode(NestLoopParam);
+                 nlp->paramno = paramid;
+                 nlp->paramval = copyObject(var);
+                 root->curOuterParams = lappend(root->curOuterParams, nlp);
+             }
+         }
+         else if (IsA(pitem->item, PlaceHolderVar))
+         {
+             PlaceHolderVar *phv = (PlaceHolderVar *) pitem->item;
+             NestLoopParam *nlp;
+             ListCell   *lc;
+
+             /*
+              * If not from a nestloop outer rel, nothing to do.  We use
+              * bms_overlap as a cheap/quick test to see if the PHV might be
+              * evaluated in the outer rels, and then grab its PlaceHolderInfo
+              * to tell for sure.
+              */
+             if (!bms_overlap(phv->phrels, root->curOuterRels))
+                 continue;
+             if (!bms_is_subset(find_placeholder_info(root, phv, false)->ph_eval_at,
+                                root->curOuterRels))
+                 continue;
+             /* Is this param already listed in root->curOuterParams? */
+             foreach(lc, root->curOuterParams)
+             {
+                 nlp = (NestLoopParam *) lfirst(lc);
+                 if (nlp->paramno == paramid)
+                 {
+                     Assert(equal(phv, nlp->paramval));
+                     /* Present, so nothing to do */
+                     break;
+                 }
+             }
+             if (lc == NULL)
+             {
+                 /* No, so add it */
+                 nlp = makeNode(NestLoopParam);
+                 nlp->paramno = paramid;
+                 nlp->paramval = copyObject(phv);
+                 root->curOuterParams = lappend(root->curOuterParams, nlp);
+             }
+         }
+     }
+     bms_free(tmpset);
+ }
+
+ /*
   * fix_indexqual_references
   *      Adjust indexqual clauses to the form the executor's indexqual
   *      machinery needs.
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3c7fa632b8ebe26c1d91e83e0ba1fa9c03b9e919..1ab9f1b78a80d63cd5f4347c5193bd360db4e6a8 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** add_vars_to_targetlist(PlannerInfo *root
*** 204,209 ****
--- 204,262 ----
      }
  }

+ /*
+  * extract_lateral_references
+  *      If the specified RTE is a LATERAL subquery, extract all its references
+  *      to Vars of the current query level, and make sure those Vars will be
+  *      available for evaluation of the RTE.
+  *
+  * XXX this is rather duplicative of processing that has to happen elsewhere.
+  * Maybe it'd be a good idea to do this type of extraction further upstream
+  * and save the results?
+  */
+ static void
+ extract_lateral_references(PlannerInfo *root, int rtindex)
+ {
+     RangeTblEntry *rte = root->simple_rte_array[rtindex];
+
+     /* No cross-references are possible if it's not LATERAL */
+     if (!rte->lateral)
+         return;
+     if (rte->rtekind == RTE_SUBQUERY)
+     {
+         List       *vars = pull_vars_of_level((Node *) rte->subquery, 1);
+         List       *newvars;
+         Relids        where_needed;
+         ListCell   *lc;
+
+         /* Copy each Var and adjust it to match our level */
+         newvars = NIL;
+         foreach(lc, vars)
+         {
+             Var       *var = (Var *) lfirst(lc);
+
+             var = copyObject(var);
+             var->varlevelsup = 0;
+             newvars = lappend(newvars, var);
+         }
+
+         /*
+          * We mark the Vars as being "needed" at the LATERAL RTE.  This is a
+          * bit of a cheat: a more formal approach would be to mark each one as
+          * needed at the join of the LATERAL RTE with its source RTE.  But it
+          * will work, and it's much less tedious than computing a separate
+          * where_needed for each Var.
+          */
+         where_needed = bms_make_singleton(rtindex);
+
+         /* Push the Vars into their source relations' targetlists */
+         add_vars_to_targetlist(root, newvars, where_needed, false);
+
+         list_free(newvars);
+         list_free(vars);
+     }
+ }
+

  /*****************************************************************************
   *
*************** deconstruct_recurse(PlannerInfo *root, N
*** 286,292 ****
      {
          int            varno = ((RangeTblRef *) jtnode)->rtindex;

!         /* No quals to deal with, just return correct result */
          *qualscope = bms_make_singleton(varno);
          /* A single baserel does not create an inner join */
          *inner_join_rels = NULL;
--- 339,347 ----
      {
          int            varno = ((RangeTblRef *) jtnode)->rtindex;

!         /* No quals to deal with, but do check for LATERAL subqueries */
!         extract_lateral_references(root, varno);
!         /* Result qualscope is just the one Relid */
          *qualscope = bms_make_singleton(varno);
          /* A single baserel does not create an inner join */
          *inner_join_rels = NULL;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 31fe55707238de952e5eeae4736ea6d10789b0da..d184f49097d6964e45e2f0edac1fb1bf7a1bded3 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** plan_cluster_use_sort(Oid tableOid, Oid
*** 3257,3262 ****
--- 3257,3263 ----
      rte->rtekind = RTE_RELATION;
      rte->relid = tableOid;
      rte->relkind = RELKIND_RELATION;
+     rte->lateral = false;
      rte->inh = false;
      rte->inFromCl = true;
      query->rtable = list_make1(rte);
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 8ce6bee8561074e164fd42f4ea8ca8fd62861051..863c943f2a0ac06585443782e27cc525c311a054 100644
*** a/src/backend/optimizer/plan/subselect.c
--- b/src/backend/optimizer/plan/subselect.c
*************** convert_ANY_sublink_to_join(PlannerInfo
*** 1231,1236 ****
--- 1231,1237 ----
      rte = addRangeTableEntryForSubquery(NULL,
                                          subselect,
                                          makeAlias("ANY_subquery", NIL),
+                                         false,
                                          false);
      parse->rtable = lappend(parse->rtable, rte);
      rtindex = list_length(parse->rtable);
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index be1219eb3d12373cf166ab56697951839f437b95..06dbe84540444fa4d72b771b0d08137f3c1b4587 100644
*** a/src/backend/optimizer/prep/prepjointree.c
--- b/src/backend/optimizer/prep/prepjointree.c
*************** is_simple_subquery(Query *subquery)
*** 1176,1181 ****
--- 1176,1188 ----
          return false;

      /*
+      * Don't pull up a LATERAL subquery (hopefully, this is just a temporary
+      * implementation restriction).
+      */
+     if (contain_vars_of_level((Node *) subquery, 1))
+         return false;
+
+     /*
       * Don't pull up a subquery that has any set-returning functions in its
       * targetlist.    Otherwise we might well wind up inserting set-returning
       * functions into places where they mustn't go, such as quals of higher
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 00052f5c846ebe87ade0c19da26655b47c64cf73..9d5c97b7a422272177855cc7957269d9e9538e21 100644
*** a/src/backend/optimizer/util/pathnode.c
--- b/src/backend/optimizer/util/pathnode.c
*************** compare_path_costs_fuzzily(Path *path1,
*** 193,199 ****
   * and cheapest_total.    The cheapest_parameterized_paths list collects paths
   * that are cheapest-total for their parameterization (i.e., there is no
   * cheaper path with the same or weaker parameterization).    This list always
!  * includes the unparameterized cheapest-total path, too.
   *
   * This is normally called only after we've finished constructing the path
   * list for the rel node.
--- 193,199 ----
   * and cheapest_total.    The cheapest_parameterized_paths list collects paths
   * that are cheapest-total for their parameterization (i.e., there is no
   * cheaper path with the same or weaker parameterization).    This list always
!  * includes the unparameterized cheapest-total path, too, if there is one.
   *
   * This is normally called only after we've finished constructing the path
   * list for the rel node.
*************** set_cheapest(RelOptInfo *parent_rel)
*** 250,264 ****
              cheapest_total_path = path;
      }

!     if (cheapest_total_path == NULL)
          elog(ERROR, "could not devise a query plan for the given query");

      parent_rel->cheapest_startup_path = cheapest_startup_path;
      parent_rel->cheapest_total_path = cheapest_total_path;
      parent_rel->cheapest_unique_path = NULL;    /* computed only if needed */

!     /* Seed the parameterized-paths list with the cheapest total */
!     parent_rel->cheapest_parameterized_paths = list_make1(cheapest_total_path);

      /* And, if there are any parameterized paths, add them in one at a time */
      if (have_parameterized_paths)
--- 250,267 ----
              cheapest_total_path = path;
      }

!     if (cheapest_total_path == NULL && !have_parameterized_paths)
          elog(ERROR, "could not devise a query plan for the given query");

      parent_rel->cheapest_startup_path = cheapest_startup_path;
      parent_rel->cheapest_total_path = cheapest_total_path;
      parent_rel->cheapest_unique_path = NULL;    /* computed only if needed */

!     /* Seed the parameterized-paths list with the cheapest total, if any */
!     if (cheapest_total_path)
!         parent_rel->cheapest_parameterized_paths = list_make1(cheapest_total_path);
!     else
!         parent_rel->cheapest_parameterized_paths = NIL;

      /* And, if there are any parameterized paths, add them in one at a time */
      if (have_parameterized_paths)
diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index 9bc90c253139deb43415683800681f48fe0c8436..0e66720c60fb32571f9a348df852aebc1319595e 100644
*** a/src/backend/optimizer/util/var.c
--- b/src/backend/optimizer/util/var.c
*************** typedef struct
*** 42,47 ****
--- 42,53 ----

  typedef struct
  {
+     List       *vars;
+     int            sublevels_up;
+ } pull_vars_context;
+
+ typedef struct
+ {
      int            var_location;
      int            sublevels_up;
  } locate_var_of_level_context;
*************** typedef struct
*** 77,82 ****
--- 83,89 ----
  static bool pull_varnos_walker(Node *node,
                     pull_varnos_context *context);
  static bool pull_varattnos_walker(Node *node, pull_varattnos_context *context);
+ static bool pull_vars_walker(Node *node, pull_vars_context *context);
  static bool contain_var_clause_walker(Node *node, void *context);
  static bool contain_vars_of_level_walker(Node *node, int *sublevels_up);
  static bool locate_var_of_level_walker(Node *node,
*************** pull_varnos(Node *node)
*** 122,127 ****
--- 129,159 ----
      return context.varnos;
  }

+ /*
+  * pull_varnos_of_level
+  *        Create a set of all the distinct varnos present in a parsetree.
+  *        Only Vars of the specified level are considered.
+  */
+ Relids
+ pull_varnos_of_level(Node *node, int levelsup)
+ {
+     pull_varnos_context context;
+
+     context.varnos = NULL;
+     context.sublevels_up = levelsup;
+
+     /*
+      * Must be prepared to start with a Query or a bare expression tree; if
+      * it's a Query, we don't want to increment sublevels_up.
+      */
+     query_or_expression_tree_walker(node,
+                                     pull_varnos_walker,
+                                     (void *) &context,
+                                     0);
+
+     return context.varnos;
+ }
+
  static bool
  pull_varnos_walker(Node *node, pull_varnos_context *context)
  {
*************** pull_varattnos_walker(Node *node, pull_v
*** 231,236 ****
--- 263,328 ----


  /*
+  * pull_vars_of_level
+  *        Create a list of all Vars referencing the specified query level
+  *        in the given parsetree.
+  *
+  * This is used on unplanned parsetrees, so we don't expect to see any
+  * PlaceHolderVars.
+  *
+  * Caution: the Vars are not copied, only linked into the list.
+  */
+ List *
+ pull_vars_of_level(Node *node, int levelsup)
+ {
+     pull_vars_context context;
+
+     context.vars = NIL;
+     context.sublevels_up = levelsup;
+
+     /*
+      * Must be prepared to start with a Query or a bare expression tree; if
+      * it's a Query, we don't want to increment sublevels_up.
+      */
+     query_or_expression_tree_walker(node,
+                                     pull_vars_walker,
+                                     (void *) &context,
+                                     0);
+
+     return context.vars;
+ }
+
+ static bool
+ pull_vars_walker(Node *node, pull_vars_context *context)
+ {
+     if (node == NULL)
+         return false;
+     if (IsA(node, Var))
+     {
+         Var           *var = (Var *) node;
+
+         if (var->varlevelsup == context->sublevels_up)
+             context->vars = lappend(context->vars, var);
+         return false;
+     }
+     Assert(!IsA(node, PlaceHolderVar));
+     if (IsA(node, Query))
+     {
+         /* Recurse into RTE subquery or not-yet-planned sublink subquery */
+         bool        result;
+
+         context->sublevels_up++;
+         result = query_tree_walker((Query *) node, pull_vars_walker,
+                                    (void *) context, 0);
+         context->sublevels_up--;
+         return result;
+     }
+     return expression_tree_walker(node, pull_vars_walker,
+                                   (void *) context);
+ }
+
+
+ /*
   * contain_var_clause
   *      Recursively scan a clause to discover whether it contains any Var nodes
   *      (of the current query level).
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 263edb5a7a61d3a3c2ea5f1a5f9a255024d86bad..0492394bc8378ffd68086f4bd72bba654690fd42 100644
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** transformInsertStmt(ParseState *pstate,
*** 533,538 ****
--- 533,539 ----
          rte = addRangeTableEntryForSubquery(pstate,
                                              selectQuery,
                                              makeAlias("*SELECT*", NIL),
+                                             false,
                                              false);
          rtr = makeNode(RangeTblRef);
          /* assume new rte is at end */
*************** transformSetOperationTree(ParseState *ps
*** 1652,1657 ****
--- 1653,1659 ----
          rte = addRangeTableEntryForSubquery(pstate,
                                              selectQuery,
                                              makeAlias(selectName, NIL),
+                                             false,
                                              false);

          /*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 6b6901197db61ea2a560bd4bd4d186e78c8ddf84..11cf0addc236c832cb99a162f99433ae052594c5 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** static void processCASbits(int cas_bits,
*** 532,540 ****

      KEY

!     LABEL LANGUAGE LARGE_P LAST_P LC_COLLATE_P LC_CTYPE_P LEADING LEAKPROOF
!     LEAST LEFT LEVEL LIKE LIMIT LISTEN LOAD LOCAL LOCALTIME LOCALTIMESTAMP
!     LOCATION LOCK_P

      MAPPING MATCH MAXVALUE MINUTE_P MINVALUE MODE MONTH_P MOVE

--- 532,540 ----

      KEY

!     LABEL LANGUAGE LARGE_P LAST_P LATERAL_P LC_COLLATE_P LC_CTYPE_P
!     LEADING LEAKPROOF LEAST LEFT LEVEL LIKE LIMIT LISTEN LOAD LOCAL
!     LOCALTIME LOCALTIMESTAMP LOCATION LOCK_P

      MAPPING MATCH MAXVALUE MINUTE_P MINVALUE MODE MONTH_P MOVE

*************** table_ref:    relation_expr
*** 9394,9406 ****
--- 9394,9433 ----
                                   parser_errposition(@1)));
                      $$ = NULL;
                  }
+             | LATERAL_P select_with_parens
+                 {
+                     /* As above, with LATERAL */
+                     if (IsA($2, SelectStmt) &&
+                         ((SelectStmt *) $2)->valuesLists)
+                         ereport(ERROR,
+                                 (errcode(ERRCODE_SYNTAX_ERROR),
+                                  errmsg("VALUES in FROM must have an alias"),
+                                  errhint("For example, FROM (VALUES ...) [AS] foo."),
+                                  parser_errposition(@2)));
+                     else
+                         ereport(ERROR,
+                                 (errcode(ERRCODE_SYNTAX_ERROR),
+                                  errmsg("subquery in FROM must have an alias"),
+                                  errhint("For example, FROM (SELECT ...) [AS] foo."),
+                                  parser_errposition(@2)));
+                     $$ = NULL;
+                 }
              | select_with_parens alias_clause
                  {
                      RangeSubselect *n = makeNode(RangeSubselect);
+                     n->lateral = false;
                      n->subquery = $1;
                      n->alias = $2;
                      $$ = (Node *) n;
                  }
+             | LATERAL_P select_with_parens alias_clause
+                 {
+                     RangeSubselect *n = makeNode(RangeSubselect);
+                     n->lateral = true;
+                     n->subquery = $2;
+                     n->alias = $3;
+                     $$ = (Node *) n;
+                 }
              | joined_table
                  {
                      $$ = (Node *) $1;
*************** col_name_keyword:
*** 12618,12623 ****
--- 12645,12651 ----
              | INT_P
              | INTEGER
              | INTERVAL
+             | LATERAL_P
              | LEAST
              | NATIONAL
              | NCHAR
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 97ab9d5581a2ad379a2b5c7928dbc7c7e5bcbcb8..9f0558ae18dbdc60a1b71e22df44894044a859af 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*************** transformRangeSubselect(ParseState *psta
*** 503,515 ****
       * the other variables weren't even visible, but it seems more useful to
       * leave them visible and give a specific error message.
       *
!      * XXX this will need further work to support SQL99's LATERAL() feature,
!      * wherein such references would indeed be legal.
       *
       * We can skip groveling through the subquery if there's not anything
       * visible in the current query.  Also note that outer references are OK.
       */
!     if (pstate->p_relnamespace || pstate->p_varnamespace)
      {
          if (contain_vars_of_level((Node *) query, 1))
              ereport(ERROR,
--- 503,516 ----
       * the other variables weren't even visible, but it seems more useful to
       * leave them visible and give a specific error message.
       *
!      * XXX this needs further work to support LATERAL subqueries.  For the
!      * moment, just assume any cross-reference in a LATERAL subquery is OK.
       *
       * We can skip groveling through the subquery if there's not anything
       * visible in the current query.  Also note that outer references are OK.
       */
!     if (!r->lateral &&
!         (pstate->p_relnamespace || pstate->p_varnamespace))
      {
          if (contain_vars_of_level((Node *) query, 1))
              ereport(ERROR,
*************** transformRangeSubselect(ParseState *psta
*** 522,528 ****
      /*
       * OK, build an RTE for the subquery.
       */
!     rte = addRangeTableEntryForSubquery(pstate, query, r->alias, true);

      return rte;
  }
--- 523,533 ----
      /*
       * OK, build an RTE for the subquery.
       */
!     rte = addRangeTableEntryForSubquery(pstate,
!                                         query,
!                                         r->alias,
!                                         r->lateral,
!                                         true);

      return rte;
  }
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 30b307b191cc4e5304dbc99e0aea95046720bd46..2477d07aaf5c3212c39200101b23e2136b12988c 100644
*** a/src/backend/parser/parse_relation.c
--- b/src/backend/parser/parse_relation.c
*************** addRangeTableEntry(ParseState *pstate,
*** 917,932 ****
       */
      heap_close(rel, NoLock);

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
       * The initial default on access checks is always check-for-READ-access,
       * which is the right thing for all except target tables.
-      *----------
       */
      rte->inh = inh;
      rte->inFromCl = inFromCl;

--- 917,929 ----
       */
      heap_close(rel, NoLock);

!     /*
!      * Set flags and access permissions.
       *
       * The initial default on access checks is always check-for-READ-access,
       * which is the right thing for all except target tables.
       */
+     rte->lateral = false;
      rte->inh = inh;
      rte->inFromCl = inFromCl;

*************** addRangeTableEntryForRelation(ParseState
*** 973,988 ****
      rte->eref = makeAlias(refname, NIL);
      buildRelationAliases(rel->rd_att, alias, rte->eref);

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
       * The initial default on access checks is always check-for-READ-access,
       * which is the right thing for all except target tables.
-      *----------
       */
      rte->inh = inh;
      rte->inFromCl = inFromCl;

--- 970,982 ----
      rte->eref = makeAlias(refname, NIL);
      buildRelationAliases(rel->rd_att, alias, rte->eref);

!     /*
!      * Set flags and access permissions.
       *
       * The initial default on access checks is always check-for-READ-access,
       * which is the right thing for all except target tables.
       */
+     rte->lateral = false;
      rte->inh = inh;
      rte->inFromCl = inFromCl;

*************** RangeTblEntry *
*** 1011,1016 ****
--- 1005,1011 ----
  addRangeTableEntryForSubquery(ParseState *pstate,
                                Query *subquery,
                                Alias *alias,
+                               bool lateral,
                                bool inFromCl)
  {
      RangeTblEntry *rte = makeNode(RangeTblEntry);
*************** addRangeTableEntryForSubquery(ParseState
*** 1054,1068 ****

      rte->eref = eref;

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
       * Subqueries are never checked for access rights.
-      *----------
       */
      rte->inh = false;            /* never true for subqueries */
      rte->inFromCl = inFromCl;

--- 1049,1060 ----

      rte->eref = eref;

!     /*
!      * Set flags and access permissions.
       *
       * Subqueries are never checked for access rights.
       */
+     rte->lateral = lateral;
      rte->inh = false;            /* never true for subqueries */
      rte->inFromCl = inFromCl;

*************** addRangeTableEntryForFunction(ParseState
*** 1192,1207 ****
                      funcname, format_type_be(funcrettype)),
                   parser_errposition(pstate, exprLocation(funcexpr))));

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
!      * Functions are never checked for access rights (at least, not by
!      * the RTE permissions mechanism).
!      *----------
       */
      rte->inh = false;            /* never true for functions */
      rte->inFromCl = inFromCl;

--- 1184,1196 ----
                      funcname, format_type_be(funcrettype)),
                   parser_errposition(pstate, exprLocation(funcexpr))));

!     /*
!      * Set flags and access permissions.
       *
!      * Functions are never checked for access rights (at least, not by the RTE
!      * permissions mechanism).
       */
+     rte->lateral = false;
      rte->inh = false;            /* never true for functions */
      rte->inFromCl = inFromCl;

*************** addRangeTableEntryForValues(ParseState *
*** 1267,1281 ****

      rte->eref = eref;

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
       * Subqueries are never checked for access rights.
-      *----------
       */
      rte->inh = false;            /* never true for values RTEs */
      rte->inFromCl = inFromCl;

--- 1256,1267 ----

      rte->eref = eref;

!     /*
!      * Set flags and access permissions.
       *
       * Subqueries are never checked for access rights.
       */
+     rte->lateral = false;
      rte->inh = false;            /* never true for values RTEs */
      rte->inFromCl = inFromCl;

*************** addRangeTableEntryForJoin(ParseState *ps
*** 1338,1352 ****

      rte->eref = eref;

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
       * Joins are never checked for access rights.
-      *----------
       */
      rte->inh = false;            /* never true for joins */
      rte->inFromCl = inFromCl;

--- 1324,1335 ----

      rte->eref = eref;

!     /*
!      * Set flags and access permissions.
       *
       * Joins are never checked for access rights.
       */
+     rte->lateral = false;
      rte->inh = false;            /* never true for joins */
      rte->inFromCl = inFromCl;

*************** addRangeTableEntryForCTE(ParseState *pst
*** 1441,1455 ****

      rte->eref = eref;

!     /*----------
!      * Flags:
!      * - this RTE should be expanded to include descendant tables,
!      * - this RTE is in the FROM clause,
!      * - this RTE should be checked for appropriate access rights.
       *
       * Subqueries are never checked for access rights.
-      *----------
       */
      rte->inh = false;            /* never true for subqueries */
      rte->inFromCl = inFromCl;

--- 1424,1435 ----

      rte->eref = eref;

!     /*
!      * Set flags and access permissions.
       *
       * Subqueries are never checked for access rights.
       */
+     rte->lateral = false;
      rte->inh = false;            /* never true for subqueries */
      rte->inFromCl = inFromCl;

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 412dfe6f9aac5f49d618666866966809fb900e47..01030219619ee5f122dc91ee4ce4ab3444ee8999 100644
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*************** pg_get_triggerdef_worker(Oid trigid, boo
*** 676,681 ****
--- 676,682 ----
          oldrte->relid = trigrec->tgrelid;
          oldrte->relkind = relkind;
          oldrte->eref = makeAlias("old", NIL);
+         oldrte->lateral = false;
          oldrte->inh = false;
          oldrte->inFromCl = true;

*************** pg_get_triggerdef_worker(Oid trigid, boo
*** 684,689 ****
--- 685,691 ----
          newrte->relid = trigrec->tgrelid;
          newrte->relkind = relkind;
          newrte->eref = makeAlias("new", NIL);
+         newrte->lateral = false;
          newrte->inh = false;
          newrte->inFromCl = true;

*************** deparse_context_for(const char *aliasnam
*** 2174,2179 ****
--- 2176,2182 ----
      rte->relid = relid;
      rte->relkind = RELKIND_RELATION;    /* no need for exactness here */
      rte->eref = makeAlias(aliasname, NIL);
+     rte->lateral = false;
      rte->inh = false;
      rte->inFromCl = true;

*************** get_from_clause_item(Node *jtnode, Query
*** 6618,6623 ****
--- 6621,6629 ----
          RangeTblEntry *rte = rt_fetch(varno, query->rtable);
          bool        gavealias = false;

+         if (rte->lateral)
+             appendStringInfoString(buf, "LATERAL ");
+
          switch (rte->rtekind)
          {
              case RTE_RELATION:
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 119e1ed2f6e18872cfd3758b9303fb2835e5695a..5d53060a70d06b56c3afb8fdf5db3fbfb091d53c 100644
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct WindowDef
*** 451,456 ****
--- 451,457 ----
  typedef struct RangeSubselect
  {
      NodeTag        type;
+     bool        lateral;        /* does it have LATERAL prefix? */
      Node       *subquery;        /* the untransformed sub-select clause */
      Alias       *alias;            /* table alias & optional column aliases */
  } RangeSubselect;
*************** typedef struct RangeTblEntry
*** 706,712 ****
       * Fields valid for a subquery RTE (else NULL):
       */
      Query       *subquery;        /* the sub-query */
!     bool        security_barrier;        /* subquery from security_barrier view */

      /*
       * Fields valid for a join RTE (else NULL/zero):
--- 707,713 ----
       * Fields valid for a subquery RTE (else NULL):
       */
      Query       *subquery;        /* the sub-query */
!     bool        security_barrier;        /* is from security_barrier view? */

      /*
       * Fields valid for a join RTE (else NULL/zero):
*************** typedef struct RangeTblEntry
*** 756,761 ****
--- 757,763 ----
       */
      Alias       *alias;            /* user-written alias clause, if any */
      Alias       *eref;            /* expanded reference names */
+     bool        lateral;        /* subquery or function is marked LATERAL? */
      bool        inh;            /* inheritance requested? */
      bool        inFromCl;        /* present in FROM clause? */
      AclMode        requiredPerms;    /* bitmask of required access permissions */
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index cf0bbd9f159e6fc168bce1a448527c3f901fff71..8238981c2894553a02d23745886f7675d77d7775 100644
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
*************** typedef struct PlannerInfo
*** 307,321 ****
   *        ppilist - ParamPathInfo nodes for parameterized Paths, if any
   *        cheapest_startup_path - the pathlist member with lowest startup cost
   *                                (regardless of its ordering; but must be
!  *                                 unparameterized)
   *        cheapest_total_path - the pathlist member with lowest total cost
   *                              (regardless of its ordering; but must be
!  *                               unparameterized)
   *        cheapest_unique_path - for caching cheapest path to produce unique
   *                               (no duplicates) output from relation
   *        cheapest_parameterized_paths - paths with cheapest total costs for
   *                                 their parameterizations; always includes
!  *                                 cheapest_total_path
   *
   * If the relation is a base relation it will have these fields set:
   *
--- 307,323 ----
   *        ppilist - ParamPathInfo nodes for parameterized Paths, if any
   *        cheapest_startup_path - the pathlist member with lowest startup cost
   *                                (regardless of its ordering; but must be
!  *                                 unparameterized; hence will be NULL for
!  *                                 a LATERAL subquery)
   *        cheapest_total_path - the pathlist member with lowest total cost
   *                              (regardless of its ordering; but must be
!  *                               unparameterized; hence will be NULL for
!  *                               a LATERAL subquery)
   *        cheapest_unique_path - for caching cheapest path to produce unique
   *                               (no duplicates) output from relation
   *        cheapest_parameterized_paths - paths with cheapest total costs for
   *                                 their parameterizations; always includes
!  *                                 cheapest_total_path, if that exists
   *
   * If the relation is a base relation it will have these fields set:
   *
diff --git a/src/include/optimizer/var.h b/src/include/optimizer/var.h
index f546362b680ca5e6f2f1a41883755b3cc6235e15..f10b838dc7f9cfdaf255cea2aa5c6c78977d4ffe 100644
*** a/src/include/optimizer/var.h
--- b/src/include/optimizer/var.h
*************** typedef enum
*** 31,37 ****
--- 31,39 ----
  } PVCPlaceHolderBehavior;

  extern Relids pull_varnos(Node *node);
+ extern Relids pull_varnos_of_level(Node *node, int levelsup);
  extern void pull_varattnos(Node *node, Index varno, Bitmapset **varattnos);
+ extern List *pull_vars_of_level(Node *node, int levelsup);
  extern bool contain_var_clause(Node *node);
  extern bool contain_vars_of_level(Node *node, int levelsup);
  extern int    locate_var_of_level(Node *node, int levelsup);
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7e55a92185b64e0eeac830c534c18b071bb4313a..3a0bc8f849e56b4a948ba3ab748ebaf28646c0c4 100644
*** a/src/include/parser/kwlist.h
--- b/src/include/parser/kwlist.h
*************** PG_KEYWORD("label", LABEL, UNRESERVED_KE
*** 213,218 ****
--- 213,219 ----
  PG_KEYWORD("language", LANGUAGE, UNRESERVED_KEYWORD)
  PG_KEYWORD("large", LARGE_P, UNRESERVED_KEYWORD)
  PG_KEYWORD("last", LAST_P, UNRESERVED_KEYWORD)
+ PG_KEYWORD("lateral", LATERAL_P, COL_NAME_KEYWORD)
  PG_KEYWORD("lc_collate", LC_COLLATE_P, UNRESERVED_KEYWORD)
  PG_KEYWORD("lc_ctype", LC_CTYPE_P, UNRESERVED_KEYWORD)
  PG_KEYWORD("leading", LEADING, RESERVED_KEYWORD)
diff --git a/src/include/parser/parse_relation.h b/src/include/parser/parse_relation.h
index ababd74f82f55cd80879a3b39a209417b2bae431..fa99066a4efb9b97cd350f9230387adb15a58232 100644
*** a/src/include/parser/parse_relation.h
--- b/src/include/parser/parse_relation.h
*************** extern RangeTblEntry *addRangeTableEntry
*** 55,60 ****
--- 55,61 ----
  extern RangeTblEntry *addRangeTableEntryForSubquery(ParseState *pstate,
                                Query *subquery,
                                Alias *alias,
+                               bool lateral,
                                bool inFromCl);
  extern RangeTblEntry *addRangeTableEntryForFunction(ParseState *pstate,
                                char *funcname,

Re: WIP patch for LATERAL subqueries

From
Andrew Dunstan
Date:
On 08/05/2012 05:58 PM, Tom Lane wrote:
> I've been idly amusing myself by trying to hack up support for
> SQL-standard LATERAL subqueries.  I've got something that turns over,
> more or less:


Awesome!!

>
> Currently the patch only implements the syntax called out in the standard,
> namely that you can put LATERAL in front of a <derived table>, which is
> to say a parenthesized sub-SELECT in FROM.  It strikes me that it might be
> worth allowing LATERAL with a function-in-FROM as well.  So basically
>     LATERAL func(args) <alias>
> would be an allowed abbreviation for
>     LATERAL (SELECT * FROM func(args)) <alias>
> Since the standard doesn't have function-in-FROM, it has nothing to say
> about whether this is sane or not.  The argument for this is mainly that
> SRFs are one of the main use-cases for LATERAL (replacing SRF-in-the-
> SELECT-list usages), so we might as well make it convenient.  Any opinions
> pro or con about that?


Pro. As you say this is the main use case, and the longer syntax just 
seems unnecessary fluff.

I'll comment on the rest of you email later, but this is just great 
news. Hardly a month goes by that I don't wish for LATERAL.


cheers

andrew


Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
I wrote:
> While fooling around in the planner I realized that I have no idea what
> outer-level aggregates mean in a LATERAL subquery, and neither does
> Postgres:
> regression=# select 1 from tenk1 a, lateral (select * from int4_tbl b where f1 = max(a.unique1)) x;
> ERROR:  plan should not reference subplan's variable
> I don't see anything prohibiting this in SQL:2008, but ordinarily this
> would be taken to be an outer-level aggregate, and surely that is not
> sensible in the LATERAL subquery.  For the moment it seems like a good
> idea to disallow it, though I am not sure where is a convenient place
> to test for such things.  Has anyone got a clue about whether this is
> well-defined, or is it simply an oversight in the spec?

On further reflection I think this is indeed disallowed by spec.  The
outer query is clearly the "aggregation query" of the aggregate, and the
aggregate appears inside that query's FROM list, therefore it's no good;
see SQL:2008 6.9 <set function specification> syntax rules 6 and 7.
(I missed this before because it's not under the aggregate function
heading.)

So the problem here is just that parseCheckAggregates neglects to grovel
through subqueries-in-FROM looking for aggregates of the current level.
Since AFAICS the case cannot arise without LATERAL, this isn't really a
pre-existing bug.

I find it fairly annoying though that parseCheckAggregates (and likewise
parseCheckWindowFuncs) have to dig through previously parsed query trees
to look for misplaced aggregates; so adding even more of that is grating
on me.  It would be a lot cleaner if transformAggregateCall and
transformWindowFuncCall could throw these errors immediately.  The
reason they can't is lack of context about what portion of the query we
are currently parsing.  I'm thinking it'd be worthwhile to add an enum
field to ParseState that shows whether we're currently parsing the
associated query level's target list, WHERE clause, GROUP BY clause,
etc.  The easiest way to ensure this gets set for all cases should be to
add the enum value as another argument to transformExpr(), which
would then save it into the ParseState for access by subsidiary
expression transformation functions.

Thoughts?
        regards, tom lane


Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 08/05/2012 05:58 PM, Tom Lane wrote:
>> Currently the patch only implements the syntax called out in the standard,
>> namely that you can put LATERAL in front of a <derived table>, which is
>> to say a parenthesized sub-SELECT in FROM.  It strikes me that it might be
>> worth allowing LATERAL with a function-in-FROM as well.

> Pro. As you say this is the main use case, and the longer syntax just 
> seems unnecessary fluff.

After some experimentation it seems that this only works if we promote
LATERAL to a fully reserved keyword.  Apparently the reason is that
given non-reserved LATERAL followed by an identifier, it's not clear
without additional lookahead whether we have "LATERAL func_name ..."
or the LATERAL is a table name and the identifier is an alias.  And the
parser has to make a shift/reduce decision before it can look beyond the
identifier.  (Without the LATERAL func_name syntax, there's no ambiguity
because LATERAL in its keyword meaning must be immediately followed by a
left paren.)

Since LATERAL has been a reserved word in every SQL spec since SQL:99,
I don't feel too bad about making it fully reserved for us too, but
nonetheless this is a cost of adding this syntax.
        regards, tom lane


Re: WIP patch for LATERAL subqueries

From
Pavel Stehule
Date:
2012/8/6 Tom Lane <tgl@sss.pgh.pa.us>:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 08/05/2012 05:58 PM, Tom Lane wrote:
>>> Currently the patch only implements the syntax called out in the standard,
>>> namely that you can put LATERAL in front of a <derived table>, which is
>>> to say a parenthesized sub-SELECT in FROM.  It strikes me that it might be
>>> worth allowing LATERAL with a function-in-FROM as well.
>
>> Pro. As you say this is the main use case, and the longer syntax just
>> seems unnecessary fluff.
>
> After some experimentation it seems that this only works if we promote
> LATERAL to a fully reserved keyword.  Apparently the reason is that
> given non-reserved LATERAL followed by an identifier, it's not clear
> without additional lookahead whether we have "LATERAL func_name ..."
> or the LATERAL is a table name and the identifier is an alias.  And the
> parser has to make a shift/reduce decision before it can look beyond the
> identifier.  (Without the LATERAL func_name syntax, there's no ambiguity
> because LATERAL in its keyword meaning must be immediately followed by a
> left paren.)
>
> Since LATERAL has been a reserved word in every SQL spec since SQL:99,
> I don't feel too bad about making it fully reserved for us too, but
> nonetheless this is a cost of adding this syntax.

+1

Pavel

>
>                         regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: WIP patch for LATERAL subqueries

From
Robert Haas
Date:
On Sun, Aug 5, 2012 at 5:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've been idly amusing myself by trying to hack up support for
> SQL-standard LATERAL subqueries.

Cool!

> Currently the patch only implements the syntax called out in the standard,
> namely that you can put LATERAL in front of a <derived table>, which is
> to say a parenthesized sub-SELECT in FROM.  It strikes me that it might be
> worth allowing LATERAL with a function-in-FROM as well.  So basically
>         LATERAL func(args) <alias>
> would be an allowed abbreviation for
>         LATERAL (SELECT * FROM func(args)) <alias>
> Since the standard doesn't have function-in-FROM, it has nothing to say
> about whether this is sane or not.  The argument for this is mainly that
> SRFs are one of the main use-cases for LATERAL (replacing SRF-in-the-
> SELECT-list usages), so we might as well make it convenient.  Any opinions
> pro or con about that?

Apparently Sybase and Microsoft SQL server use a slightly different
syntax, CROSS APPLY, for this.

http://iablog.sybase.com/paulley/2008/07/cross-and-outer-apply/

It may make sense to consider mimicking that instead of inventing our
own way of doing it, but I haven't investigated much so it's also
possible that it doesn't make sense.

> While fooling around in the planner I realized that I have no idea what
> outer-level aggregates mean in a LATERAL subquery, and neither does
> Postgres:
> regression=# select 1 from tenk1 a, lateral (select * from int4_tbl b where f1 = max(a.unique1)) x;
> ERROR:  plan should not reference subplan's variable
> I don't see anything prohibiting this in SQL:2008, but ordinarily this
> would be taken to be an outer-level aggregate, and surely that is not
> sensible in the LATERAL subquery.  For the moment it seems like a good
> idea to disallow it, though I am not sure where is a convenient place
> to test for such things.  Has anyone got a clue about whether this is
> well-defined, or is it simply an oversight in the spec?

My mental picture of LATERAL (which might be inaccurate) is that it
has the semantics that you'd get from a parameterized nestloop.  So I
can't assign any meaning to that either.

> Comments, better ideas?

Thanks for working on this - sorry I don't have more thoughts right at
the moment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP patch for LATERAL subqueries

From
Amit Kapila
Date:
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
Sent: Monday, August 06, 2012 6:16 PM
On Sun, Aug 5, 2012 at 5:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> Currently the patch only implements the syntax called out in the
standard,
>> namely that you can put LATERAL in front of a <derived table>, which is
>> to say a parenthesized sub-SELECT in FROM.  It strikes me that it might
be
>> worth allowing LATERAL with a function-in-FROM as well.  So basically
>>         LATERAL func(args) <alias>
>> would be an allowed abbreviation for
>>         LATERAL (SELECT * FROM func(args)) <alias>
>> Since the standard doesn't have function-in-FROM, it has nothing to say
>> about whether this is sane or not.  The argument for this is mainly that
>> SRFs are one of the main use-cases for LATERAL (replacing SRF-in-the-
>> SELECT-list usages), so we might as well make it convenient.  Any
opinions
>> pro or con about that?

> Apparently Sybase and Microsoft SQL server use a slightly different
> syntax, CROSS APPLY, for this.

> http://iablog.sybase.com/paulley/2008/07/cross-and-outer-apply/

> It may make sense to consider mimicking that instead of inventing our
> own way of doing it, but I haven't investigated much so it's also
> possible that it doesn't make sense.

There are certain differences mentioned in the link due to which I am 
not sure it can be mimicked exactly, and may be that's why Sybase also has
both syntaxes.

Differences
-----------------
The LATERAL keyword cannot NULL-supply rows, whereas an OUTER APPLY will.
Additionally, with a LATERAL derived table, the derived table and the outer
reference must be separated by a comma. For an APPLY operator, the table
expression on the right and the outer reference cannot be separated by a
comma, but they can be separated by any other join operator. In other words,
the APPLY operator allows references to any table within the left table
expression, whereas the LATERAL keyword allows references to tables outside
the current table expression.


With Regards,
Amit Kapila.




Re: WIP patch for LATERAL subqueries

From
Amit Kapila
Date:
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Monday, August 06, 2012 3:28 AM
To: pgsql-hackers@postgreSQL.org
Subject: [HACKERS] WIP patch for LATERAL subqueries

> I've been idly amusing myself by trying to hack up support for
SQL-standard LATERAL subqueries.  I've got something 
> that turns over, more or less:

> regression=# select * from int4_tbl a, lateral (select unique1,unique2
from tenk1 b where a.f1 = unique1) x;
> f1 | unique1 | unique2 
> ----+---------+---------
>  0 |       0 |    9998
> (1 row)

> regression=# explain select * from int4_tbl a, lateral (select
unique1,unique2 from tenk1 b where a.f1 = unique1) x;
>                                    QUERY PLAN

>
----------------------------------------------------------------------------
-------
> Nested Loop  (cost=0.00..42.55 rows=5 width=12)
>   ->  Seq Scan on int4_tbl a  (cost=0.00..1.05 rows=5 width=4)
>   ->  Index Scan using tenk1_unique1 on tenk1 b  (cost=0.00..8.28 rows=1
width=8)
>         Index Cond: (a.f1 = unique1)
> (4 rows)

> but there's a good deal of work left to do, some of which could use some
discussion.

> Feature/semantics issues:

> Currently the patch only implements the syntax called out in the standard,
namely that you can put LATERAL in front of > a <derived table>, which is to
say a parenthesized sub-SELECT in FROM.  It strikes me that it might be
worth allowing 
> LATERAL with a function-in-FROM as well.  So basically
>     LATERAL func(args) <alias>
> would be an allowed abbreviation for
>    LATERAL (SELECT * FROM func(args)) <alias> Since the standard
doesn't have function-in-FROM, it has nothing to say > about whether this is
sane or not.  The argument for this is mainly that SRFs are one of the main
use-cases for LATERAL > (replacing SRF-in-the- SELECT-list usages), so we
might as well make it convenient.  Any opinions pro or con about 
> that?

I have checked Sybase also has similar syntax for functions by other keyword
APPLY. So this should be good way to specify.

> While fooling around in the planner I realized that I have no idea what
outer-level aggregates mean in a LATERAL 
> subquery, and neither does
> Postgres:
> regression=# select 1 from tenk1 a, lateral (select * from int4_tbl b
where f1 = max(a.unique1)) x;
> ERROR:  plan should not reference subplan's variable I don't see anything
prohibiting this in SQL:2008, but ordinarily > this would be taken to be an
outer-level aggregate, and surely that is not sensible in the LATERAL
subquery.  For the 
> moment it seems like a good idea to disallow it, though I am not sure
where is a convenient place to test for such 
> things.  Has anyone got a clue about whether this is well-defined, or is
it simply an oversight in the spec?

I have checked in Oracle and it gives error in such query:
SQL> select * from t1, Lateral (select * from t2 where t2.c2 = max(t1.c1)); 
select * from t1, Lateral (select * from t2 where t2.c2 = max(t1.c1))
      * 
 
ERROR at line 1: 
ORA-00934: group function is not allowed here


With Regards,
Amit Kapila.




Re: WIP patch for LATERAL subqueries

From
Robert Haas
Date:
On Mon, Aug 6, 2012 at 10:07 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>> Currently the patch only implements the syntax called out in the
> standard,
>>> namely that you can put LATERAL in front of a <derived table>, which is
>>> to say a parenthesized sub-SELECT in FROM.  It strikes me that it might
> be
>>> worth allowing LATERAL with a function-in-FROM as well.  So basically
>>>         LATERAL func(args) <alias>
>>> would be an allowed abbreviation for
>>>         LATERAL (SELECT * FROM func(args)) <alias>
>>> Since the standard doesn't have function-in-FROM, it has nothing to say
>>> about whether this is sane or not.  The argument for this is mainly that
>>> SRFs are one of the main use-cases for LATERAL (replacing SRF-in-the-
>>> SELECT-list usages), so we might as well make it convenient.  Any
> opinions
>>> pro or con about that?
>
>> Apparently Sybase and Microsoft SQL server use a slightly different
>> syntax, CROSS APPLY, for this.
>
>> http://iablog.sybase.com/paulley/2008/07/cross-and-outer-apply/
>
>> It may make sense to consider mimicking that instead of inventing our
>> own way of doing it, but I haven't investigated much so it's also
>> possible that it doesn't make sense.
>
> There are certain differences mentioned in the link due to which I am
> not sure it can be mimicked exactly, and may be that's why Sybase also has
> both syntaxes.
>
> Differences
> -----------------
> The LATERAL keyword cannot NULL-supply rows, whereas an OUTER APPLY will.
> Additionally, with a LATERAL derived table, the derived table and the outer
> reference must be separated by a comma. For an APPLY operator, the table
> expression on the right and the outer reference cannot be separated by a
> comma, but they can be separated by any other join operator. In other words,
> the APPLY operator allows references to any table within the left table
> expression, whereas the LATERAL keyword allows references to tables outside
> the current table expression.

I think you can always simulate CROSS APPLY using LATERAL.  The syntax
is different but the functionality is the same.  However, OUTER APPLY
allows you to do something that I don't think is possible using
LATERAL.  While it would be nice to have both CROSS APPLY and OUTER
APPLY, my main point was to suggest supporting CROSS APPLY rather than
the extension to the LATERAL syntax Tom proposed.  That is, the spec
allows:

FROM x,  LATERAL (SELECT * FROM srf(x.a)) y

...and Tom proposed allowing this to be shortened to:

FROM x, LATERAL srf(x.a)

...and what I'm saying is maybe we should instead allow it to be shortened to:

FROM x CROSS APPLY srf(x.a)

...as some other database systems are already doing.  I can't think of
any particular reason why Tom's proposed shorthand would be
problematic; I'm just suggesting that it may be better to support the
same shorthand that other people already support rather than inventing
our own, idiosyncratic shorthand.

That having been said, I get paid the same either way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Apparently Sybase and Microsoft SQL server use a slightly different
> syntax, CROSS APPLY, for this.

> http://iablog.sybase.com/paulley/2008/07/cross-and-outer-apply/

Well, this is only a blog entry and not their manual, but AFAICT that is
just a useless deviation from SQL-standard syntax; it does nothing that
"CROSS JOIN LATERAL" or "LEFT JOIN LATERAL ... ON true" wouldn't do.
I can't tell if the blogger simply doesn't know that LATERAL can be used
in a JOIN nest, or if that's actually a misfeature of the DBMS.

I'm not in favor of duplicating this.
        regards, tom lane


Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I think you can always simulate CROSS APPLY using LATERAL.  The syntax
> is different but the functionality is the same.  However, OUTER APPLY
> allows you to do something that I don't think is possible using
> LATERAL.

Uh, what exactly?  AFAICT from that blog entry, "x OUTER APPLY y" is
exactly the same as "x LEFT JOIN LATERAL y ON true".  Okay, so you
saved three words, but is that a good enough reason to invent a
nonstandard syntax?
        regards, tom lane


Re: WIP patch for LATERAL subqueries

From
Robert Haas
Date:
On Mon, Aug 6, 2012 at 11:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think you can always simulate CROSS APPLY using LATERAL.  The syntax
>> is different but the functionality is the same.  However, OUTER APPLY
>> allows you to do something that I don't think is possible using
>> LATERAL.
>
> Uh, what exactly?  AFAICT from that blog entry, "x OUTER APPLY y" is
> exactly the same as "x LEFT JOIN LATERAL y ON true".  Okay, so you
> saved three words, but is that a good enough reason to invent a
> nonstandard syntax?

I wasn't sure that x LEFT JOIN LATERAL (SELECT * FROM y) ON true is
valid syntax.  I thought that perhaps LATERAL() was only allowed
around a top-level FROM-list item.

However, if it is allowed, then I agree that the extra syntax isn't
adding any functionality; it's just a question of whether you happen
to like their particular choice of notational shorthand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Aug 6, 2012 at 11:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Uh, what exactly?  AFAICT from that blog entry, "x OUTER APPLY y" is
>> exactly the same as "x LEFT JOIN LATERAL y ON true".  Okay, so you
>> saved three words, but is that a good enough reason to invent a
>> nonstandard syntax?

> I wasn't sure that x LEFT JOIN LATERAL (SELECT * FROM y) ON true is
> valid syntax.  I thought that perhaps LATERAL() was only allowed
> around a top-level FROM-list item.

No.  LATERAL is allowed in a <table reference>, which can be either
a top-level FROM item or a component of a JOIN nest.  (My current
patch doesn't actually work for the latter case, but I'm going to
work on fixing that next.)  What's curious about that Sybase blog
is that the blogger seems to think that LATERAL can only be used
at top level ... but I'm not sure if that's actually a restriction
in Sybase, or just a gap in his knowledge.
        regards, tom lane


Re: WIP patch for LATERAL subqueries

From
Amit Kapila
Date:
From: Robert Haas [mailto:robertmhaas@gmail.com] 
Sent: Monday, August 06, 2012 8:07 PM
On Mon, Aug 6, 2012 at 10:07 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> I think you can always simulate CROSS APPLY using LATERAL.  The syntax
> is different but the functionality is the same.  However, OUTER APPLY
> allows you to do something that I don't think is possible using
> LATERAL.  While it would be nice to have both CROSS APPLY and OUTER
> APPLY, my main point was to suggest supporting CROSS APPLY rather than
> the extension to the LATERAL syntax Tom proposed.  That is, the spec
> allows:

> FROM x,  LATERAL (SELECT * FROM srf(x.a)) y

I think in SQL specs it is not clearly mentioned about functions.
The same is mentioned by Tom in his mail
" So basicallyLATERAL func(args) <alias>  would be an allowed abbreviation forLATERAL (SELECT * FROM func(args))
<alias>Since the standard
 
doesn't have function-in-FROM, it has nothing to say about whether this is
sane or not."

> ...and Tom proposed allowing this to be shortened to:

> FROM x, LATERAL srf(x.a)

> ...and what I'm saying is maybe we should instead allow it to be shortened
to:

>FROM x CROSS APPLY srf(x.a)

>...as some other database systems are already doing.  

I think if specs doesn't mention clearly  about functions then we can use
LATERAL syntax similar to CROSS APPLY
which is proposed by Tom.

With Regards,
Amit Kapila.



Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
Here's an updated version of my LATERAL patch.

* Accepts "LATERAL func_name(args)".

* Handles LATERAL in JOIN nests now.  I rewrote the way
transformFromClause manages visibility of previously-parsed FROM items.
Rather than my previous idea of adding more namespace lists to a
ParseState, I changed p_relnamespace and p_varnamespace to be lists of
ParseNamespaceItem structs, which carry an RTE pointer plus visibility
flags.  This seemed to work out fairly well, and allowed me to implement
the RIGHT/FULL JOIN case exactly as per SQL spec, that is the left side
of a "RIGHT JOIN LATERAL" is visible to the right side but throws error
if used.  (I'm a bit tempted now to collapse p_relnamespace and
p_varnamespace into just one list, by adding two more booleans to
ParseNamespaceItem to show whether the item is visible for qualified or
unqualified references.  But that would affect code that the current
patch doesn't need to touch, so it seems better to postpone it to a
separate refactoring patch.)  This results in some changes in the error
messages output for improper-column-reference errors, as shown in the
regression test deltas.  It also fixes the pre-existing spec-conformance
issue about lateral versus parent references, as per Andrew Gierth in
http://archives.postgresql.org/message-id/87ocpjscpa.fsf@news-spur.riddles.org.uk

* Throws error if LATERAL is used to create an aggregate that belongs to
its aggregation query's FROM clause.  I did this in a slightly grotty
way: transformAggregateCall looks at the p_lateral_active field of the
appropriate pstate level, relying on the fact that the only way the
case can happen is via LATERAL.  As I mentioned earlier, I think it'd
be better to add a ParseState field showing exactly which query part
we're parsing at any instant.  However, again that seems better done as
part of a separate refactoring patch.

I have not done anything yet about the planner shortcomings.

What I'd like to do next, barring objections, is to band-aid the places
where the planner could crash on a LATERAL query (probably just make it
throw FEATURE_NOT_SUPPORTED errors), write some documentation, and
then commit what I've got.  After that I can go back to improve the
planner and work on the parser refactoring issues as separate patches.

Comments, better ideas?

            regards, tom lane


Attachment

Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
I wrote:
> What I'd like to do next, barring objections, is to band-aid the places
> where the planner could crash on a LATERAL query (probably just make it
> throw FEATURE_NOT_SUPPORTED errors), write some documentation, and
> then commit what I've got.  After that I can go back to improve the
> planner and work on the parser refactoring issues as separate patches.

... and done (though the pgsql-committers message seems to have got hung
up for moderation).  I put some simplistic examples into section 7.2.1.5
and the SELECT reference page ... if anybody has ideas for
more-compelling small examples, please speak up.
        regards, tom lane


Re: WIP patch for LATERAL subqueries

From
Merlin Moncure
Date:
On Tue, Aug 7, 2012 at 6:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> What I'd like to do next, barring objections, is to band-aid the places
>> where the planner could crash on a LATERAL query (probably just make it
>> throw FEATURE_NOT_SUPPORTED errors), write some documentation, and
>> then commit what I've got.  After that I can go back to improve the
>> planner and work on the parser refactoring issues as separate patches.
>
> ... and done (though the pgsql-committers message seems to have got hung
> up for moderation).  I put some simplistic examples into section 7.2.1.5
> and the SELECT reference page ... if anybody has ideas for
> more-compelling small examples, please speak up.

This is just awesome.  Anyways, I was looking around the docs for
references to the old methodology of select list SRF function calls.
This paragraph:
http://www.postgresql.org/docs/devel/static/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET

could probably use some enhancement describing best practices in a
LATERAL world and more examples of dealing with set returning
functions in general.  I also noticed that the build in SRF page
(http://www.postgresql.org/docs/devel/static/functions-srf.html) lies
with the comment "This section describes functions that possibly
return more than one row. Currently the only functions in this class
are series generating functions" since at minimum we have 'unnest' so
that page could use some wordsmithing as well.

merlin


Re: WIP patch for LATERAL subqueries

From
Tom Lane
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> This is just awesome.  Anyways, I was looking around the docs for
> references to the old methodology of select list SRF function calls.
> This paragraph:
> http://www.postgresql.org/docs/devel/static/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET

> could probably use some enhancement describing best practices in a
> LATERAL world and more examples of dealing with set returning
> functions in general.

Done, thanks for the suggestion.

> I also noticed that the build in SRF page
> (http://www.postgresql.org/docs/devel/static/functions-srf.html) lies
> with the comment "This section describes functions that possibly
> return more than one row. Currently the only functions in this class
> are series generating functions" since at minimum we have 'unnest' so
> that page could use some wordsmithing as well.

Yeah, there's also some regexp-related SRFs, as well as a boatload of
built-in SRFs that are mainly meant to underlie views.  I guess we could
try to force all of those into this page, but it doesn't really seem
like it'd be an improvement.  I took out the claim that these were all
such functions, instead.
        regards, tom lane