Thread: Do not scan index in right table if condition for left join evaluates to false using columns in left table

Hi, could you help to understand why Postgres scans index in right table in the following case:

CREATE TABLE parent (
  id integer PRIMARY KEY,
  dtype text
);

CREATE TABLE child (
  id integer PRIMARY KEY
);

INSERT INTO parent (id, dtype) values (1, 'A');
INSERT INTO child (id) values (1);

EXPLAIN ANALYZE
SELECT *
FROM parent p
LEFT JOIN child c
  ON p.id = c.id
  AND p.dtype = 'B'
WHERE p.id = 1;

Note that the only record in parent table has dtype == 'A', but the join condition has p.dtype = 'B'.
The query plan still shows Index Only Scan on child table with loops=1.

 Nested Loop Left Join  (cost=0.31..16.36 rows=1 width=40) (actual time=0.104..0.107 rows=1 loops=1)
   Join Filter: (p.dtype = 'B'::text)
   Rows Removed by Join Filter: 1
   ->  Index Scan using parent_pkey on parent p  (cost=0.15..8.17 rows=1 width=36) (actual time=0.018..0.019 rows=1 loops=1)
         Index Cond: (id = 1)
   ->  Index Only Scan using child_pkey on child c  (cost=0.15..8.17 rows=1 width=4) (actual time=0.078..0.080 rows=1 loops=1)
         Index Cond: (id = 1)
         Heap Fetches: 1

In comparison, if using INNER JOIN, Index Only Scan on child table is never executed.
Tested on PostgreSQL 17.2
Hi,

On 2024-12-07 21:30:46 +0300, Илья Жарков wrote:
> Hi, could you help to understand why Postgres scans index in right table in
> the following case:
>
> CREATE TABLE parent (
> >   id integer PRIMARY KEY,
> >   dtype text
> > );
>
>
> CREATE TABLE child (
> >   id integer PRIMARY KEY
> > );
>
>
> INSERT INTO parent (id, dtype) values (1, 'A');
> > INSERT INTO child (id) values (1);
>
>
> EXPLAIN ANALYZE
> > SELECT *
> > FROM parent p
> > LEFT JOIN child c
> >   ON p.id = c.id
> >   AND p.dtype = 'B'
> > WHERE p.id = 1;
>
>
> Note that the only record in *parent *table has dtype == 'A', but the join
> condition has p.dtype = 'B'.
> The query plan still shows Index Only Scan on *child *table with loops=1.

The relevant difference between the inner and left join is that for the inner
join we push down the p.dtype = 'B'::text condition down. However, we do *not*
do so for outer joins.

Outer:
┌───────────────────────────────────────────────────┐
│                    QUERY PLAN                     │
├───────────────────────────────────────────────────┤
│ Nested Loop Left Join                             │
│   Join Filter: (p.dtype = 'B'::text)              │
│   ->  Index Scan using parent_pkey on parent p    │
│         Index Cond: (id = 1)                      │
│   ->  Index Only Scan using child_pkey on child c │
│         Index Cond: (id = 1)                      │
└───────────────────────────────────────────────────┘

Inner:
┌───────────────────────────────────────────────────┐
│                    QUERY PLAN                     │
├───────────────────────────────────────────────────┤
│ Nested Loop                                       │
│   ->  Index Scan using parent_pkey on parent p    │
│         Index Cond: (id = 1)                      │
│         Filter: (dtype = 'B'::text)               │
│   ->  Index Only Scan using child_pkey on child c │
│         Index Cond: (id = 1)                      │
└───────────────────────────────────────────────────┘


We *do* have code that recognizes the case where a clause in a join's ON only
references the nullable side. We however don't have code that recognizes the
same if it's the non-nullable side.

That's somewhat surprising, but it does kinda make sense: A after all, in a
query like yours, you could just have had the p.dtype = 'B' in the WHERE list,
rather than inside the join's ON. The same isn't true for the nullable side of
the join, as a condition for te nullable side in the WHERE clause breaks the
join's "outerness".

I.e. you can write your query as
  SELECT * FROM parent p LEFT JOIN child c ON p.id = c.id WHERE p.id = 1 AND p.dtype = 'B';
in which case you get the expected query plan:

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                         QUERY PLAN
     │
 

├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop Left Join  (cost=0.31..16.36 rows=1 width=40) (actual time=0.035..0.036 rows=0 loops=1)
     │
 
│   ->  Index Scan using parent_pkey on parent p  (cost=0.15..8.17 rows=1 width=36) (actual time=0.034..0.034 rows=0
loops=1)│
 
│         Index Cond: (id = 1)
     │
 
│         Filter: (dtype = 'B'::text)
     │
 
│         Rows Removed by Filter: 1
     │
 
│   ->  Index Only Scan using child_pkey on child c  (cost=0.15..8.17 rows=1 width=4) (never executed)
     │
 
│         Index Cond: (id = 1)
     │
 
│         Heap Fetches: 0
     │
 
│ Planning Time: 29.912 ms
     │
 
│ Execution Time: 0.095 ms
     │
 

└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


ISTM that it shouldn't be expensive to recognize this type of join clause and
pushes them down. While it could be done by the query's author, it seems worth
handling this on our side.  But maybe I'm missing something here?


Greetings,

Andres Freund

PS:

Here's the repro in an easier to copy way:

CREATE TABLE parent (id integer PRIMARY KEY,dtype text not null);
CREATE TABLE child (id integer PRIMARY KEY);
INSERT INTO parent (id, dtype) values (1, 'A');
INSERT INTO child (id) values (1);

EXPLAIN SELECT * FROM parent p LEFT JOIN child c ON p.id = c.id AND p.dtype = 'B' WHERE p.id = 1;
EXPLAIN SELECT * FROM parent p JOIN child c ON p.id = c.id AND p.dtype = 'B' WHERE p.id = 1;



Andres Freund <andres@anarazel.de> writes:
> ISTM that it shouldn't be expensive to recognize this type of join clause and
> pushes them down. While it could be done by the query's author, it seems worth
> handling this on our side.  But maybe I'm missing something here?

No, that condition *can't* be pushed down to the LHS scan, because its
failure should not remove LHS rows from the output; it can only cause
them to have nulls in the RHS columns.

One could imagine that we split up the join filter conditions into
"depends on RHS" and "doesn't depend on RHS" subsets, and make the
nestloop plan node evaluate the latter set only once per LHS row,
and then skip the inner-side scan when that condition fails.
But this would be a bunch of new mechanism that's only useful for
outer joins, only for rather hokey outer join conditions, and only
for nestloop-type joins.  I'm pretty dubious that it's worth the
trouble -- not least because I don't recall anybody complaining
about this before.

            regards, tom lane



On 2024-12-07 16:37:32 -0500, Andres Freund wrote:
> On 2024-12-07 21:30:46 +0300, Илья Жарков wrote:
> > Note that the only record in *parent *table has dtype == 'A', but the join
> > condition has p.dtype = 'B'.
> > The query plan still shows Index Only Scan on *child *table with loops=1.
>
> The relevant difference between the inner and left join is that for the inner
> join we push down the p.dtype = 'B'::text condition down. However, we do *not*
> do so for outer joins.
>
> Outer:
> ┌───────────────────────────────────────────────────┐
> │                    QUERY PLAN                     │
> ├───────────────────────────────────────────────────┤
> │ Nested Loop Left Join                             │
> │   Join Filter: (p.dtype = 'B'::text)              │
> │   ->  Index Scan using parent_pkey on parent p    │
> │         Index Cond: (id = 1)                      │
> │   ->  Index Only Scan using child_pkey on child c │
> │         Index Cond: (id = 1)                      │
> └───────────────────────────────────────────────────┘
>
> Inner:
> ┌───────────────────────────────────────────────────┐
> │                    QUERY PLAN                     │
> ├───────────────────────────────────────────────────┤
> │ Nested Loop                                       │
> │   ->  Index Scan using parent_pkey on parent p    │
> │         Index Cond: (id = 1)                      │
> │         Filter: (dtype = 'B'::text)               │
> │   ->  Index Only Scan using child_pkey on child c │
> │         Index Cond: (id = 1)                      │
> └───────────────────────────────────────────────────┘
>
>
> We *do* have code that recognizes the case where a clause in a join's ON only
> references the nullable side. We however don't have code that recognizes the
> same if it's the non-nullable side.
>
> That's somewhat surprising, but it does kinda make sense: A after all, in a
> query like yours, you could just have had the p.dtype = 'B' in the WHERE list,
> rather than inside the join's ON. The same isn't true for the nullable side of
> the join, as a condition for te nullable side in the WHERE clause breaks the
> join's "outerness".
>
> I.e. you can write your query as
>   SELECT * FROM parent p LEFT JOIN child c ON p.id = c.id WHERE p.id = 1 AND p.dtype = 'B';
> in which case you get the expected query plan:
>
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
> │                                                         QUERY PLAN
       │
 
>
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
> │ Nested Loop Left Join  (cost=0.31..16.36 rows=1 width=40) (actual time=0.035..0.036 rows=0 loops=1)
       │
 
> │   ->  Index Scan using parent_pkey on parent p  (cost=0.15..8.17 rows=1 width=36) (actual time=0.034..0.034 rows=0
loops=1)│
 
> │         Index Cond: (id = 1)
       │
 
> │         Filter: (dtype = 'B'::text)
       │
 
> │         Rows Removed by Filter: 1
       │
 
> │   ->  Index Only Scan using child_pkey on child c  (cost=0.15..8.17 rows=1 width=4) (never executed)
       │
 
> │         Index Cond: (id = 1)
       │
 
> │         Heap Fetches: 0
       │
 
> │ Planning Time: 29.912 ms
       │
 
> │ Execution Time: 0.095 ms
       │
 
>
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Ah, wait - that's not actually equivalent. When p.dtype = 'B' inthe ON clause,
we still produce an output row, but not if in the WHERE clause.

So:

> ISTM that it shouldn't be expensive to recognize this type of join clause and
> pushes them down. While it could be done by the query's author, it seems worth
> handling this on our side.  But maybe I'm missing something here?

yes, I was missing something, it would not be a valid transformation.


I guess it's still somewhat silly that we do an index scan for the inner side,
even though we could know that we'll always fail to the joinqual.  We could
evaluate the qual after fetching the outer row, before fetching the matching
inner row.

It's might not be worth adding code to handle such cases to the the nested
loop code, this is probably not that common a query pattern. If we don't want
to have explict execution time code paths, we could emit a "constant
qualification" Result node above the inner side of a parametrized nested loop?

Greetings,

Andres Freund




I've just realized that I replied not to the whole mailing list. This is a duplicate of mail sent to Andres.

Thank you for the quick response.

Some background using code and concepts of particular languages and frameworks:

Exactly such queries are automatically built by Hibernate if using polymorphic queries with fetching associations of sub-classes using function treat. https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#hql-function-treat

Reproducer:

@Entity
@Inheritance
@DiscriminatorValue("A")
class Parent {
    @Id
    Integer id;
}

@Entity
@DiscriminatorValue("B")
class ParentB extends Parent {
    @OneToOne(mappedBy = "parent")
    Child child;
}

@Entity
class Child {
    @Id
    Integer id;

    @OneToOne
    @MapsId
    @JoinColumn(name = "id")
    private ParentB parent;
}

List<Parent> resultList = session.createQuery(
        """
        from Parent p
        left join fetch treat(p as ParentB).child
        where p.id = :id
        """, Parent.class)
        .setParameter("id", 1)
        .getResultList();

вс, 8 дек. 2024 г. в 01:21, Andres Freund <andres@anarazel.de>:
On 2024-12-07 16:37:32 -0500, Andres Freund wrote:
> On 2024-12-07 21:30:46 +0300, Илья Жарков wrote:
> > Note that the only record in *parent *table has dtype == 'A', but the join
> > condition has p.dtype = 'B'.
> > The query plan still shows Index Only Scan on *child *table with loops=1.
>
> The relevant difference between the inner and left join is that for the inner
> join we push down the p.dtype = 'B'::text condition down. However, we do *not*
> do so for outer joins.
>
> Outer:
> ┌───────────────────────────────────────────────────┐
> │                    QUERY PLAN                     │
> ├───────────────────────────────────────────────────┤
> │ Nested Loop Left Join                             │
> │   Join Filter: (p.dtype = 'B'::text)              │
> │   ->  Index Scan using parent_pkey on parent p    │
> │         Index Cond: (id = 1)                      │
> │   ->  Index Only Scan using child_pkey on child c │
> │         Index Cond: (id = 1)                      │
> └───────────────────────────────────────────────────┘
>
> Inner:
> ┌───────────────────────────────────────────────────┐
> │                    QUERY PLAN                     │
> ├───────────────────────────────────────────────────┤
> │ Nested Loop                                       │
> │   ->  Index Scan using parent_pkey on parent p    │
> │         Index Cond: (id = 1)                      │
> │         Filter: (dtype = 'B'::text)               │
> │   ->  Index Only Scan using child_pkey on child c │
> │         Index Cond: (id = 1)                      │
> └───────────────────────────────────────────────────┘
>
>
> We *do* have code that recognizes the case where a clause in a join's ON only
> references the nullable side. We however don't have code that recognizes the
> same if it's the non-nullable side.
>
> That's somewhat surprising, but it does kinda make sense: A after all, in a
> query like yours, you could just have had the p.dtype = 'B' in the WHERE list,
> rather than inside the join's ON. The same isn't true for the nullable side of
> the join, as a condition for te nullable side in the WHERE clause breaks the
> join's "outerness".
>
> I.e. you can write your query as
>   SELECT * FROM parent p LEFT JOIN child c ON p.id = c.id WHERE p.id = 1 AND p.dtype = 'B';
> in which case you get the expected query plan:
> ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
> │                                                         QUERY PLAN                                                          │
> ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
> │ Nested Loop Left Join  (cost=0.31..16.36 rows=1 width=40) (actual time=0.035..0.036 rows=0 loops=1)                         │
> │   ->  Index Scan using parent_pkey on parent p  (cost=0.15..8.17 rows=1 width=36) (actual time=0.034..0.034 rows=0 loops=1) │
> │         Index Cond: (id = 1)                                                                                                │
> │         Filter: (dtype = 'B'::text)                                                                                         │
> │         Rows Removed by Filter: 1                                                                                           │
> │   ->  Index Only Scan using child_pkey on child c  (cost=0.15..8.17 rows=1 width=4) (never executed)                        │
> │         Index Cond: (id = 1)                                                                                                │
> │         Heap Fetches: 0                                                                                                     │
> │ Planning Time: 29.912 ms                                                                                                    │
> │ Execution Time: 0.095 ms                                                                                                    │
> └─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Ah, wait - that's not actually equivalent. When p.dtype = 'B' inthe ON clause,
we still produce an output row, but not if in the WHERE clause.

So:

> ISTM that it shouldn't be expensive to recognize this type of join clause and
> pushes them down. While it could be done by the query's author, it seems worth
> handling this on our side.  But maybe I'm missing something here?

yes, I was missing something, it would not be a valid transformation.


I guess it's still somewhat silly that we do an index scan for the inner side,
even though we could know that we'll always fail to the joinqual.  We could
evaluate the qual after fetching the outer row, before fetching the matching
inner row.

It's might not be worth adding code to handle such cases to the the nested
loop code, this is probably not that common a query pattern. If we don't want
to have explict execution time code paths, we could emit a "constant
qualification" Result node above the inner side of a parametrized nested loop?

Greetings,

Andres Freund
Hi,

On 2024-12-07 17:06:52 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > ISTM that it shouldn't be expensive to recognize this type of join clause and
> > pushes them down. While it could be done by the query's author, it seems worth
> > handling this on our side.  But maybe I'm missing something here?
>
> No, that condition *can't* be pushed down to the LHS scan, because its
> failure should not remove LHS rows from the output; it can only cause
> them to have nulls in the RHS columns.

Yea, just sent an email saying so after realizing the same.


> One could imagine that we split up the join filter conditions into
> "depends on RHS" and "doesn't depend on RHS" subsets, and make the
> nestloop plan node evaluate the latter set only once per LHS row,
> and then skip the inner-side scan when that condition fails.
> But this would be a bunch of new mechanism that's only useful for
> outer joins, only for rather hokey outer join conditions, and only
> for nestloop-type joins.  I'm pretty dubious that it's worth the
> trouble -- not least because I don't recall anybody complaining
> about this before.

As I wrote in my other email, I'm also somewhat dubious it's worth having
explicit code for this in nodeNestloop.c.

Not convinced that such queries are that hokey though, it's not that rare a
thing to want to left join to some other table only if the outer side matches
some attribute. We ourselves do have a few cases like that in our queries,
e.g. psql's describeOneTableDetails(), pg_dump's getTables() and several
others.  ...


Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2024-12-07 17:06:52 -0500, Tom Lane wrote:
>> One could imagine that we split up the join filter conditions into
>> "depends on RHS" and "doesn't depend on RHS" subsets, and make the
>> nestloop plan node evaluate the latter set only once per LHS row,
>> and then skip the inner-side scan when that condition fails.

> As I wrote in my other email, I'm also somewhat dubious it's worth having
> explicit code for this in nodeNestloop.c.

Yeah.  Your idea of pushing the "doesn't depend on RHS" subset into
a one-time Result filter atop the RHS is interesting though.  Then we
don't need any new executor machinery, but we pay for that with a more
complex planner patch.  Not sure how hard that would be.

            regards, tom lane



Hi,

On 2024-12-08 09:23:42 +0700, Andrei Lepikhov wrote:
> To determine how difficult it could be, I wrote a prototype (very raw),
> which shows how it works in action and how many subsystems we have to touch.

Cool!


> Also, I wonder why you think it could work with NestLoop only.

It doesn't seem to apply to mergejoins. For hashjoins, it doesn't seem worth a
separate evaluation of a expression - something which has noticeable overhead
on its own - given that the cost of a lookup in the hashtable isn't
particularly high. Compare that to the nestloop case where you always need an
indexscan and often will have more complicated subtrees being executed
unnecessarily on the inner side.


> I think avoiding touching a hash table and an index under MergeJoin can also
> be beneficial.

How would you get significant wins for mergejoins? You need to go through both
inner and outer anyway?

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2024-12-08 09:23:42 +0700, Andrei Lepikhov wrote:
>> I think avoiding touching a hash table and an index under MergeJoin can also
>> be beneficial.

> How would you get significant wins for mergejoins? You need to go through both
> inner and outer anyway?

Yeah, sounds like nonsense to me too.  The reason this can be a win
for nestloop is that we perform a new scan of the inner relation
for each outer row, and if we can skip an inner scan altogether
then we're ahead.  But mergejoin and hashjoin only scan each input
once anyway, so I see no opportunity to save any scan work.

It's barely possible that this is interesting for hash join
because you could save scanning a hash list ... but frankly
I don't believe there could be enough win there to justify
additional mechanism.  Hash join is in a world of hurt already
if there are lots of duplicate hash codes.

            regards, tom lane



On 8/12/2024 09:52, Andres Freund wrote:
>> I think avoiding touching a hash table and an index under MergeJoin can also
>> be beneficial.
> 
> How would you get significant wins for mergejoins? You need to go through both
> inner and outer anyway?
In my mind, this trick can be designed for specific cases like sales 
tables, as illustrated before and used by well-rounded developers. I'm 
not sure that such optimisation would be profitable in general. My point 
is that the sales database has lots of categories, and when requesting 
product descriptions, we will not necessarily touch all the categories - 
in that case, the one-sided clause could allow us to avoid scanning some 
tables at all. Am I wrong?
BTW, may it be used in SEMI JOIN cases?

-- 
regards, Andrei Lepikhov




Regarding merge joins, I suppose in some edge cases inner set scan might not even be started.

FROM parent p
LEFT JOIN child c
  ON p.id = c.id
  AND p.dtype = 'B'

        ┌───┐ ┌───┐
parent  │1,A│ │2,A│
        └───┘ └───┘
       ^    
        ┌───┐ ┌───┐
child   │ 1 │ │ 2 │
        └───┘ └───┘
       ^

If p.dtype = 'B' was evaluated early, the pointer could move through the outer set as long as it is evaluated to false.
In the above example, it reaches the end without even accessing the inner set.

        ┌───┐ ┌───┐
parent  │1,A│ │2,A│
        └───┘ └───┘
                   ^
        ┌───┐ ┌───┐
child   │ 1 │ │ 2 │
        └───┘ └───┘
       ^

In the opposite scenario:

        ┌───┐ ┌───┐
parent  │1,B│ │2,A│
        └───┘ └───┘
       ^    
        ┌───┐ ┌───┐
child   │ 1 │ │ 2 │
        └───┘ └───┘
       ^

it would need to start moving the inner pointer at some point.

        ┌───┐ ┌───┐
parent  │1,B│ │2,A│
        └───┘ └───┘
                   ^
        ┌───┐ ┌───┐
child   │ 1 │ │ 2 │
        └───┘ └───┘
             ^    

But even in this case, the pointer may not reach the end in the inner set.

Though this highly depends on how merge join is implemented in the Postgres code. I have to admit that I have a very vague idea on this...

вс, 8 дек. 2024 г. в 11:44, Andrei Lepikhov <lepihov@gmail.com>:
On 8/12/2024 09:52, Andres Freund wrote:
>> I think avoiding touching a hash table and an index under MergeJoin can also
>> be beneficial.
>
> How would you get significant wins for mergejoins? You need to go through both
> inner and outer anyway?
In my mind, this trick can be designed for specific cases like sales
tables, as illustrated before and used by well-rounded developers. I'm
not sure that such optimisation would be profitable in general. My point
is that the sales database has lots of categories, and when requesting
product descriptions, we will not necessarily touch all the categories -
in that case, the one-sided clause could allow us to avoid scanning some
tables at all. Am I wrong?
BTW, may it be used in SEMI JOIN cases?

--
regards, Andrei Lepikhov

On 2024-12-08 15:44:23 +0700, Andrei Lepikhov wrote:
> On 8/12/2024 09:52, Andres Freund wrote:
> > > I think avoiding touching a hash table and an index under MergeJoin can also
> > > be beneficial.
> > 
> > How would you get significant wins for mergejoins? You need to go through both
> > inner and outer anyway?

> In my mind, this trick can be designed for specific cases like sales tables,
> as illustrated before and used by well-rounded developers.

I'm not saying that the optimization isn't useful, just that i don't think it
makes much sense for mergeoins.

Besides, at least as far as I can see, fundamentally not optimizing mergejoins
in any substantial manner, it also just doesn't seem likely that queries where
this optimization would come up are likely to be planned as mergejoins. If you
have a leftjoin to a category-style table, you're very rarely going to scan
the "main" table with an ordered index scan on the category key, which would
be required for a merge join. And sorting the main table once for each
to-be-joined-category-table isn't a desirable path most of the time either.

I don't know what it means that it's to be "used by well-rounded
developers". We have to write the implementation in way it works regardless of
what kind of developer is using postgres.


> I'm not sure that such optimisation would be profitable in general.

Are you suggesting it would only be enabled by a GUC?


> My point is that the sales database has lots of categories, and when
> requesting product descriptions, we will not necessarily touch all the
> categories - in that case, the one-sided clause could allow us to avoid
> scanning some tables at all. Am I wrong?

No, I don't think you're wrong. I just don't think it's worthwhile for
mergejoins, because I don't see relevant query patterns where it would
help.


> BTW, may it be used in SEMI JOIN cases?

Seems possible.

Greetings,

Andres Freund



On 8/12/2024 23:37, Andres Freund wrote:
> On 2024-12-08 15:44:23 +0700, Andrei Lepikhov wrote:
>> On 8/12/2024 09:52, Andres Freund wrote:
>>>> I think avoiding touching a hash table and an index under MergeJoin can also
>>>> be beneficial.
>>>
>>> How would you get significant wins for mergejoins? You need to go through both
>>> inner and outer anyway?
> 
>> In my mind, this trick can be designed for specific cases like sales tables,
>> as illustrated before and used by well-rounded developers.
> 
> I'm not saying that the optimization isn't useful, just that i don't think it
> makes much sense for mergeoins.
> 
> Besides, at least as far as I can see, fundamentally not optimizing mergejoins
> in any substantial manner, it also just doesn't seem likely that queries where
> this optimization would come up are likely to be planned as mergejoins. If you
> have a leftjoin to a category-style table, you're very rarely going to scan
> the "main" table with an ordered index scan on the category key, which would
> be required for a merge join. And sorting the main table once for each
> to-be-joined-category-table isn't a desirable path most of the time either.
> 
> I don't know what it means that it's to be "used by well-rounded
> developers". We have to write the implementation in way it works regardless of
> what kind of developer is using postgres.
I got it, thanks. I agree that the profit for MJ & HJ cases looks 
modest. When designing the prototype, I realised that it seemed more 
"symmetrical" and logical to separate such 'one-sided' clauses during 
the planning for all types of join.
But I don't insist on this statement. It would be OK even in the case of 
NestLoop (remember, we should change the NL cost model too).

>> I'm not sure that such optimisation would be profitable in general.
> 
> Are you suggesting it would only be enabled by a GUC?
No, such a feature probably won't generate much overhead; I don't see 
any reason to cover it under a GUC.

-- 
regards, Andrei Lepikhov