Re: Consider parallel for lateral subqueries with limit - Mailing list pgsql-hackers

From James Coleman
Subject Re: Consider parallel for lateral subqueries with limit
Date
Msg-id CAAaqYe8A7=rCYMuiTuiscHw53qM2nR3T2_0i6C9c3Uxb5Lrt8Q@mail.gmail.com
Whole thread Raw
In response to Re: Consider parallel for lateral subqueries with limit  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Thu, Sep 22, 2022 at 5:19 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 4:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Mon, Sep 19, 2022 at 3:58 PM James Coleman <jtc331@gmail.com> wrote:
> > > But in the case where there's correlation via LATERAL we already don't
> > > guarantee unique executions for a given set of params into the lateral
> > > subquery execution, right? For example, suppose we have:
> > >
> > >   select *
> > >   from foo
> > >   left join lateral (
> > >     select n
> > >     from bar
> > >     where bar.a = foo.a
> > >     limit 1
> > >   ) on true
> > >
> > > and suppose that foo.a (in the table scan) returns these values in
> > > order: 1, 2, 1. In that case we'll execute the lateral subquery 3
> > > separate times rather than attempting to order the results of foo.a
> > > such that we can re-execute the subquery only when the param changes
> > > to a new unique value (and we definitely don't cache the results to
> > > guarantee unique subquery executions).
> >
> > I think this is true, but I don't really understand why we should
> > focus on LATERAL here. What we really need, and I feel like we've
> > talked about this before, is a way to reason about where parameters
> > are set and used.
>
> Yes, over in the thread "Parallelize correlated subqueries that
> execute within each worker" [1] which was meant to build on top of
> this work (though is technically separate). Your bringing it up here
> too is making me wonder if we can combine the two and instead of
> always allowing subqueries with LIMIT instead (like the other patch
> does) delay final determination of parallel safety of rels and paths
> (perhaps, as that thread is discussing, until gather/gather merge path
> creation).

Upon further investigation and thought I believe it *might* be
possible to do what I'd wondered about above (delay final
determination of parallel safety of the rel until later on in planning
by marking e.g. rels as tentatively safe and re-evaluating that as we
go) as my original patch did on the referenced thread, but that thread
also ended up with a much simpler proposed approach that still moved
final parallel safety determination to later in the planner, but it
did it by checking (in generate_gather_paths and
generate_user_gather_paths) whether that point in the plan tree
supplies the params required by the partial path.

So the current approach in that other thread is orthogonal to (if
complementary in some queries) the current question, which is "must we
immediately disallow parallelism on a rel that has a limit?"

Tom's concern about my arguments about special casing lateral was:

> This argument seems to be assuming that Y is laterally dependent on X,
> but the patch as written will take *any* lateral dependency as a
> get-out-of-jail-free card.  If what we have is "Limit(Y sub Z)" where
> Z is somewhere else in the query tree, it's not apparent to me that
> your argument holds.

I do see now that there was a now obvious flaw in the original patch:
rte->lateral may well be set to true even in cases where there's no
actual lateral dependency. That being said I don't see a need to
determine if the current subtree provides the required lateral
dependency, for the following reasons:

1. We don't want to set rel->consider_parallel to false immediately
because that will then poison everything higher in the tree, despite
the fact that it may well be that it's only higher up the plan tree
that the lateral dependency is provided. Regardless of the level in
the plan tree at which the param is provided we are going to execute
the subquery (even in serial execution) once per outer tuple at the
point in the join tree where the lateral join lies.
2. We're *not* at this point actually checking parallel safety of a
given path (i.e., is the path parallel safe at this point given the
params currently provided), we're only checking to see if the rel
itself should be entirely excluded from consideration for parallel
plans at any point in the future.
3. We already know that the question of whether or not a param is
provided can't be the concern here since it isn't under consideration
in the existing code path. That is, if a subquery doesn't have a limit
then this particular check won't determine that the subquery's
existence should result in setting rel->consider_parallel to false
because of any params it may or may not contain.
4. It is already the case that a subplan using exec params in the
where clause will not be considered parallel safe in path generation.

I believe the proper check for when to exclude subqueries with limit
clauses is (as in the attached patch) prohibiting a limit when
rel->lateral_relids is empty. Here are several examples of queries and
plans and how this code plays out to attempt to validate that
hypothesis:

 select *
  from foo
  left join lateral (
    select n
    from bar
    where bar.a = foo.a
    limit 1
  ) on true;

 Nested Loop Left Join  (cost=0.00..8928.05 rows=2550 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Limit  (cost=0.00..3.48 rows=1 width=4)
         ->  Seq Scan on bar  (cost=0.00..38.25 rows=11 width=4)
               Filter: (a = foo.a)

This was my base case for the past few emails. It hits the modified
code path with rte->lateral = true and
bms_num_members(rel->lateral_relids) = 1. The patch would allow the
subquery rel.consider_parallel to be set to true, however with solely
this patch we still won't put the limit subquery within the
parallelized portion of the plan because of the exec param used in the
where clause.

 select *
  from foo
  left join lateral (
    select foo.a
    from bar
    limit 1
  ) on true;

 Nested Loop Left Join  (cost=0.00..97.78 rows=2550 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Limit  (cost=0.00..0.01 rows=1 width=4)
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

In this case the lateral reference is only in the target list of the
subquery, and this form is where the patch kicks in to allow placing
the gather node above the whole join tree (thus executing the limit
subquery within each worker).

select *, (select n from bar where bar.a = foo.a limit 1) from foo;

 Seq Scan on foo  (cost=0.00..8902.55 rows=2550 width=8)
   SubPlan 1
     ->  Limit  (cost=0.00..3.48 rows=1 width=4)
           ->  Seq Scan on bar  (cost=0.00..38.25 rows=11 width=4)
                 Filter: (a = foo.a)

Robert wondered if this was effectively the same thing, and it
definitely, in my opinion, ought to be the same--in terms of the
results you expect--as my original example. However this example
doesn't appear to hit this code path at all. We also already
parallelize this form of query (both when the outer tuple reference is
in the subquery's target list and when it's in the subquery's where
clause).

 select *
  from foo
  left join lateral (
    select n
    from bar
    where bar.a = foo.a
  ) on true;

 Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
   Merge Cond: (bar.a = foo.a)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: bar.a
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
   ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: foo.a
         ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)

Removing the limit results in the correlated subquery being rewritten
as a join. Because this rewrite removes the correlation (at least in
execution) this query also doesn't hit the modified code path at all.
I do find this one interesting because it's one of these examples of
how we already provide different guarantees around correlated subquery
execution (even when executing serially): when there's a limit here
the subquery executes multiple times (which means, for example, that
the results of the scan may be returned in a different order) but
without the limit it executes a single time (so the order is
guaranteed).

 select *
  from foo
  left join lateral (
    select n
    from bar
  ) on true;

 Nested Loop Left Join  (cost=0.00..140795.50 rows=5763000 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

If we remove the correlation but retain the lateral keyword we also
don't hit this code path. By the way, the plan is also the same if you
remove the useless lateral keyword in this query.

 select *
  from foo
  where foo.a in (
    select bar.a from bar limit 1
 );

 Hash Semi Join  (cost=0.03..42.37 rows=13 width=4)
   Hash Cond: (foo.a = bar.a)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Hash  (cost=0.01..0.01 rows=1 width=4)
         ->  Limit  (cost=0.00..0.01 rows=1 width=4)
               ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

This is the query form Tom referenced from a bug report that
originally brought in the current code that excludes all subqueries
with limits from parallelization. This form does indeed hit the code
block in question, but rte->lateral is false and
bms_num_members(rel->lateral_relids) is 0 so it is unaffected by this
patch.

 select *
 from foo
 left join (
   select n
   from bar
   limit 1
 ) on true;

 Nested Loop Left Join  (cost=0.00..67.39 rows=2550 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Materialize  (cost=0.00..0.02 rows=1 width=4)
         ->  Limit  (cost=0.00..0.01 rows=1 width=4)
               ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

This query also has rte->lateral is false and
bms_num_members(rel->lateral_relids) is 0 when reaching the line of
code changed in this patch. The interesting thing about this query is
that if you set enable_material = off the materialize node goes away,
and we do not attempt to rewrite the query into something that would
execute the subquery only once, so this is a case where we already
don't provide the theoretically strongest possible guarantees, though
one could reasonably argue it's a bit artificial with materialization
turned off.

In summary we already allow parallelizing this type of execution
pattern if the subquery is in the select clause; we apply stricter
standards to all subqueries in from clauses. I believe the prohibition
on parallelizing subqueries with limits in from clauses was
unnecessarily restrictive when it was added. When we have lateral
dependencies we execute the query in the same was we would when the
subquery is in the target list (i.e., per tuple at that point in the
join tree), and so we should be able to parallelize those subqueries
in the from clause just like we already do when they are in the target
list.

In the attached series the first patch adds a bunch of new tests to
show a bunch of permutations of queries. Most of the added test
queries don't end up changing with the 2nd patch applied (the actual
code changes)  so that you can easily see the narrow scope of what's
affected. I don't envision most of the tests sticking around if this
is committed, but hopefully it provides a helpful way to evaluate the
semantics of the change I'm proposing.

Thanks,
James Coleman

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [RFC] building postgres with meson - v13
Next
From: James Coleman
Date:
Subject: Re: Allow foreign keys to reference a superset of unique columns