Re: Patch to support SEMI and ANTI join removal - Mailing list pgsql-hackers

From David Rowley
Subject Re: Patch to support SEMI and ANTI join removal
Date
Msg-id CAApHDvo+uZuPUSRFmJYMNEe5=70eEro5=HuX=tciZRfTRJ5C1A@mail.gmail.com
Whole thread Raw
In response to Re: Patch to support SEMI and ANTI join removal  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Patch to support SEMI and ANTI join removal
List pgsql-hackers
On Tue, Oct 7, 2014 at 3:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 6, 2014 at 5:57 AM, David Rowley <dgrowleyml@gmail.com> wrote:
> Can anyone shed any light on how I might determine where the scan rel is in
> the tree? I need to find it so I can check if the RangeTblEntry is marked as
> skip-able.

I think you're probably going to need logic that knows about
particular node types and does the right thing for each one.  I think
- but maybe I'm misunderstanding - what you're looking for is a
function of the form Oid ScansOnePlainTableWithoutQuals().  The
algorithm could be something like:

switch (node type)
{
    case T_SeqScanState:
        if (no quals)
            return the_appropriate_oid;
        return false;
    case T_HashJoin:
        decide whether we can ignore one side of the join
        fish out the node from the other side of the join (including
reaching through the Hash node if necessary)
        return ScansOnePlainTableWithoutQuals(the node we fished out);
    ...other specific cases...
    default:
        return false;
}

Thanks Robert.

Ok, so I've been hacking away at this for a couple of evenings and I think I have a working prototype finally!
My earlier thoughts about having to descend down until I find a seqscan were wrong. It looks like just need to look at the next node down, if it's a seqscan and it's marked as not needed, then we can skip that side of the join, or if the child node is a HashJoinState then check the skip status of that node, if both sides are marked as not needed, then skip that side of the join.

I've just completed some simple performance tests:

create table t3 (id int primary key);
create table t2 (id int primary key, t3_id int not null references t3);
create table t1 (id int primary key, t2_id int not null references t2);
  
I've loaded these tables with 4 million rows each.The performance is as follows:

test=# select count(*) from t1 inner join t2 on t1.t2_id=t2.id inner join t3 on t2.t3_id=t3.id;
  count
---------
 4000000
(1 row)
Time: 1022.492 ms

test=# select count(*) from t1;
  count
---------
 4000000
(1 row)
Time: 693.642 ms

test=# alter table t2 drop constraint t2_t3_id_fkey;
ALTER TABLE
Time: 2.141 ms
test=# alter table t1 drop constraint t1_t2_id_fkey;
ALTER TABLE
Time: 1.678 ms
test=# select count(*) from t1 inner join t2 on t1.t2_id=t2.id inner join t3 on t2.t3_id=t3.id;
  count
---------
 4000000
(1 row)
Time: 11682.525 ms

So it seems it's not quite as efficient as join removal at planning time, but still a big win when it's possible to perform the join skipping.

As yet, I've only added support for hash joins, and I've not really looked into detail on what's needed for nested loop joins or merge joins. 

For hash join I just added some code into the case HJ_BUILD_HASHTABLE: in ExecHashJoin(). The code just checks if any side can be skipped, if they can then the node will never move out of the HJ_BUILD_HASHTABLE state, so that each time ExecHashJoin() is called, it'll just return the next tuple from the non-skip side of the join until we run out of tuples... Or if both sides can be skipped NULL is returned as any nodes above this shouldn't attempt to scan, perhaps this should just be an Assert() as I don't think any parent nodes should ever bother executing that node if it's not required. The fact that I've put this code into the switch under HJ_BUILD_HASHTABLE makes me think it should add about close to zero overhead for when the join cannot be skipped.

One thing that we've lost out of this execution time join removal checks method is the ability to still remove joins where the join column is NULLable. The previous patch added IS NOT NULL to ensure the query was equivalent when a join was removed, of course this meant that any subsequent joins may later not have been removed due to the IS NOT NULL quals existing (which could restrict the rows and remove the ability that the foreign key could guarantee the existence of exactly 1 row matching the join condition). I've not yet come up with a nice way to reimplement these null checks at execution time, so I've thought that perhaps I won't bother, at least not for this patch. I'd just disable join skipping at planning time if any of the join columns can have NULLs.

Anyway this is just a progress report, and also to say thanks to Andres, you might just have saved this patch with the execution time checking idea.

I'll post a patch soon, hopefully once I have merge and nestloop join types working.

Regards

David Rowley

pgsql-hackers by date:

Previous
From: Marti Raudsepp
Date:
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Next
From: Andres Freund
Date:
Subject: Re: Patch to support SEMI and ANTI join removal