Thread: [HACKERS] Quals not pushed down into lateral

[HACKERS] Quals not pushed down into lateral

From
Andres Freund
Date:
Hi,

During citus development we noticed that restrictions aren't pushed down
into lateral subqueries, even if they semantically could.  For example,
in this dumbed down example:

postgres[31776][1]=# CREATE TABLE t_2(id serial primary key);
postgres[31776][1]=# CREATE TABLE t_1(id serial primary key);

Comparing:

postgres[31776][1]=# EXPLAIN SELECT * FROM t_1 JOIN (SELECT * FROM t_2 GROUP BY id) s ON (t_1.id = s.id) WHERE t_1.id =
3;
┌─────────────────────────────────────────────────────────────────────────────────────┐
│                                     QUERY PLAN                                      │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop  (cost=0.31..16.37 rows=1 width=8)                                      │
│   ->  Index Only Scan using t_1_pkey on t_1  (cost=0.15..8.17 rows=1 width=4)       │
│         Index Cond: (id = 3)                                                        │
│   ->  Group  (cost=0.15..8.17 rows=1 width=4)                                       │
│         Group Key: t_2.id                                                           │
│         ->  Index Only Scan using t_2_pkey on t_2  (cost=0.15..8.17 rows=1 width=4) │
│               Index Cond: (id = 3)                                                  │
└─────────────────────────────────────────────────────────────────────────────────────┘
(7 rows)

with:

postgres[31776][1]=# EXPLAIN SELECT * FROM t_1, LATERAL (SELECT * FROM t_2 WHERE t_1.id = t_2.id GROUP BY id) s WHERE
t_1.id= 3;
 
┌─────────────────────────────────────────────────────────────────────────────────────┐
│                                     QUERY PLAN                                      │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop  (cost=0.31..16.37 rows=1 width=8)                                      │
│   ->  Index Only Scan using t_1_pkey on t_1  (cost=0.15..8.17 rows=1 width=4)       │
│         Index Cond: (id = 3)                                                        │
│   ->  Group  (cost=0.15..8.17 rows=1 width=4)                                       │
│         Group Key: t_2.id                                                           │
│         ->  Index Only Scan using t_2_pkey on t_2  (cost=0.15..8.17 rows=1 width=4) │
│               Index Cond: (id = t_1.id)                                             │
└─────────────────────────────────────────────────────────────────────────────────────┘

it's noticeable that the former has id = 3 pushed down into both
relations index scans, whereas the latter doesn't.


This seems like a worthwhile future optimization opportunity.


I've not looked into this in any detail, but the proximate source is
that set_subquery_pathlist() doesn't see any baserstrictinfos to push
down.  Which makes sense, because t_1.id = t_2.id isn't "visible" (in
the sense of deconstruct_jointree dealing with it) to the outside.

It seems possible to look into rel->lateral_vars, check whether that's
member of some equivclass, and then push the relevant equivalences down
(after taking care that the Var from the outside is known as a Param on
the inside).


I'm not planning to work on this anytime soon, but I thought it'd be
useful to have a searchable reference point about the topic.  If
somebody wants to work on it...

Greetings,

Andres Freund



Re: [HACKERS] Quals not pushed down into lateral

From
Robert Haas
Date:
On Thu, Mar 16, 2017 at 4:45 AM, Andres Freund <andres@anarazel.de> wrote:
> During citus development we noticed that restrictions aren't pushed down
> into lateral subqueries, even if they semantically could.  For example,
> in this dumbed down example:
>
> postgres[31776][1]=# CREATE TABLE t_2(id serial primary key);
> postgres[31776][1]=# CREATE TABLE t_1(id serial primary key);
>
> Comparing:
>
> postgres[31776][1]=# EXPLAIN SELECT * FROM t_1 JOIN (SELECT * FROM t_2 GROUP BY id) s ON (t_1.id = s.id) WHERE t_1.id
=3;
 
> postgres[31776][1]=# EXPLAIN SELECT * FROM t_1, LATERAL (SELECT * FROM t_2 WHERE t_1.id = t_2.id GROUP BY id) s WHERE
t_1.id= 3;
 

Interesting.  That does seem like we are missing a trick.

Not exactly related, but I think we need to improve optimization
around CTEs, too.  AFAICT, what we've got right now, almost everybody
hates.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Quals not pushed down into lateral

From
Andres Freund
Date:
On 2017-04-13 16:34:12 -0400, Robert Haas wrote:
> On Thu, Mar 16, 2017 at 4:45 AM, Andres Freund <andres@anarazel.de> wrote:
> > During citus development we noticed that restrictions aren't pushed down
> > into lateral subqueries, even if they semantically could.  For example,
> > in this dumbed down example:
> >
> > postgres[31776][1]=# CREATE TABLE t_2(id serial primary key);
> > postgres[31776][1]=# CREATE TABLE t_1(id serial primary key);
> >
> > Comparing:
> >
> > postgres[31776][1]=# EXPLAIN SELECT * FROM t_1 JOIN (SELECT * FROM t_2 GROUP BY id) s ON (t_1.id = s.id) WHERE
t_1.id= 3;
 
> > postgres[31776][1]=# EXPLAIN SELECT * FROM t_1, LATERAL (SELECT * FROM t_2 WHERE t_1.id = t_2.id GROUP BY id) s
WHEREt_1.id = 3;
 
> 
> Interesting.  That does seem like we are missing a trick.

Yea.

> Not exactly related, but I think we need to improve optimization
> around CTEs, too.  AFAICT, what we've got right now, almost everybody
> hates.

That's certainly an issue, but it's a lot harder to resolve because
we've, for years, told people to intentionally use CTEs as optimization
barriers :(

- Andres



Re: [HACKERS] Quals not pushed down into lateral

From
David Fetter
Date:
On Thu, Apr 13, 2017 at 01:39:07PM -0700, Andres Freund wrote:
> On 2017-04-13 16:34:12 -0400, Robert Haas wrote:
> > On Thu, Mar 16, 2017 at 4:45 AM, Andres Freund <andres@anarazel.de> wrote:
> > > During citus development we noticed that restrictions aren't pushed down
> > > into lateral subqueries, even if they semantically could.  For example,
> > > in this dumbed down example:
> > >
> > > postgres[31776][1]=# CREATE TABLE t_2(id serial primary key);
> > > postgres[31776][1]=# CREATE TABLE t_1(id serial primary key);
> > >
> > > Comparing:
> > >
> > > postgres[31776][1]=# EXPLAIN SELECT * FROM t_1 JOIN (SELECT * FROM t_2 GROUP BY id) s ON (t_1.id = s.id) WHERE
t_1.id= 3;
 
> > > postgres[31776][1]=# EXPLAIN SELECT * FROM t_1, LATERAL (SELECT * FROM t_2 WHERE t_1.id = t_2.id GROUP BY id) s
WHEREt_1.id = 3;
 
> > 
> > Interesting.  That does seem like we are missing a trick.
> 
> Yea.
> 
> > Not exactly related, but I think we need to improve optimization
> > around CTEs, too.  AFAICT, what we've got right now, almost everybody
> > hates.
> 
> That's certainly an issue, but it's a lot harder to resolve because
> we've, for years, told people to intentionally use CTEs as optimization
> barriers :(

If we can get better performance by removing the barriers, we can
certainly explain the new hack, assuming there is or needs to be
one, in the release notes.  We haven't promised to keep the current
behavior forever, nor should we.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate