Re: Parameterized-path cost comparisons need some work - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Parameterized-path cost comparisons need some work |
Date | |
Msg-id | 22521.1334797983@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Parameterized-path cost comparisons need some work (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
So while testing this patch I've found out that there is a pretty nasty bug in HEAD as well as in my current formulation of the patch. Here is an example using the regression database: select count(*) from (tenk1 a join tenk1 b on a.unique1=b.unique2) left join tenk1 c on a.unique2 = b.unique1 and c.thousand= a.thousand join int4_tbl on b.thousand = f1; The correct answer to this query is 10, according to all previous PG branches, but HEAD is reporting zero. A look at the query plan soon finds the smoking gun: Aggregate (cost=264.29..264.30 rows=1 width=0) -> Nested Loop Left Join (cost=0.00..264.16 rows=50 width=0) JoinFilter: (a.unique2 = b.unique1) -> Nested Loop (cost=0.00..234.21 rows=50 width=12) Join Filter:(b.unique2 = a.unique1) -> Nested Loop (cost=0.00..211.85 rows=50 width=8) -> SeqScan on int4_tbl (cost=0.00..1.05 rows=5 width=4) -> Index Scan using tenk1_thous_tenthous on tenk1b (cost=0.00..42.04 rows=10 width=12) Index Cond: (thousand = int4_tbl.f1) -> Index Scan using tenk1_unique2 on tenk1 a (cost=0.00..0.43 rows=1 width=12) Index Cond: (unique2 =b.unique1) -> Index Only Scan using tenk1_thous_tenthous on tenk1 c (cost=0.00..0.45 rows=10 width=4) Index Cond: (thousand = a.thousand) The condition a.unique2 = b.unique1 is getting pushed down into the a/b join, where it should not go; applying it there causes a/b join rows to be discarded when they ought to survive and then be null-extended at the left join. What this demonstrates is that the rule for identifying safe movable clauses that's used in HEAD is not good enough. It rejects clauses that reference relations that are on the inside of a left join relative to the target relation --- but the problematic clause doesn't actually reference c, so that doesn't trigger. The issue would go away if we examined required_relids instead of clause_relids, but that causes other, perfectly safe, optimizations to be discarded. I believe that the correct formulation of the join clause movement rule is "outer join clauses can't be pushed into the left-hand side of their outer join". However, the present representation of clauses doesn't provide enough information to test this cheaply during scan planning (indeed maybe it can't be done at all after distribute_qual_to_rels, because we don't retain any explicit memory of which clauses are above or below which outer joins). Looking at nullable_relids isn't the right thing because what that tells you about is what's on the right-hand side of the outer join, not the left side. So I'm coming to the conclusion that to get this right, we need to add another relids field to RestrictInfo and have it filled in during distribute_qual_to_rels. This is really probably going to end up cheaper than what we have today, because the join clause movement tests are somewhat expensive as the code stands, and it should be possible to reduce the number of operations needed there. What I have in mind is that the new field would be named something like outer_left_relids, and would list the relids of all base rels that are in the left-hand side of the outer join that the clause belongs to (or be NULL if the clause isn't an outer-join clause). This is cheaply constructable during distribute_qual_to_rels, we just are not bothering to record the information at present. Then the join clause movement checks can easily detect that it would be unsafe to push down such a clause to any of the listed relations. regards, tom lane
pgsql-hackers by date: