Re: Bad plan after vacuum analyze - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: Bad plan after vacuum analyze |
Date | |
Msg-id | 17017.1115843555@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Bad plan after vacuum analyze (Guillaume Smet <guillaume_ml@smet.org>) |
Responses |
Re: Bad plan after vacuum analyze
|
List | pgsql-performance |
Ah-ha, I can replicate the problem. This example uses tenk1 from the regression database, which has a column unique2 containing just the integers 0..9999. regression=# create table t1(f1 int); CREATE TABLE regression=# insert into t1 values(5); INSERT 154632 1 regression=# insert into t1 values(7); INSERT 154633 1 regression=# analyze t1; ANALYZE regression=# explain analyze select * from tenk1 right join t1 on (unique2=f1); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Merge Right Join (cost=1.03..1.37 rows=2 width=248) (actual time=0.507..0.617 rows=2 loops=1) Merge Cond: ("outer".unique2 = "inner".f1) -> Index Scan using tenk1_unique2 on tenk1 (cost=0.00..498.24 rows=10024 width=244) (actual time=0.126..0.242 rows=9loops=1) -> Sort (cost=1.03..1.03 rows=2 width=4) (actual time=0.145..0.153 rows=2 loops=1) Sort Key: t1.f1 -> Seq Scan on t1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.029..0.049 rows=2 loops=1) Total runtime: 1.497 ms (7 rows) The planner correctly perceives that only a small part of the unique2 index will need to be scanned, and hence thinks the merge is cheap --- much cheaper than if the whole index had to be scanned. And it is. Notice that only 9 rows were actually pulled from the index. Once we got to unique2 = 8, nodeMergejoin.c could see that no more matches to f1 were possible. But watch this: regression=# insert into t1 values(null); INSERT 154634 1 regression=# explain analyze select * from tenk1 right join t1 on (unique2=f1); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Merge Right Join (cost=1.03..1.37 rows=2 width=248) (actual time=0.560..290.874 rows=3 loops=1) Merge Cond: ("outer".unique2 = "inner".f1) -> Index Scan using tenk1_unique2 on tenk1 (cost=0.00..498.24 rows=10024 width=244) (actual time=0.139..106.982 rows=10000loops=1) -> Sort (cost=1.03..1.03 rows=2 width=4) (actual time=0.181..0.194 rows=3 loops=1) Sort Key: t1.f1 -> Seq Scan on t1 (cost=0.00..1.02 rows=2 width=4) (actual time=0.032..0.067 rows=3 loops=1) Total runtime: 291.670 ms (7 rows) See what happened to the actual costs of the indexscan? All of a sudden we had to scan the whole index because there was a null in the other input, and nulls sort high. I wonder if it is worth fixing nodeMergejoin.c to not even try to match nulls to the other input. We'd have to add a check to see if the join operator is strict or not, but it nearly always will be. The alternative would be to make the planner only believe in the short-circuit path occuring if it thinks that the other input is entirely non-null ... but this seems pretty fragile, since it only takes one null to mess things up, and ANALYZE can hardly be counted on to detect one null in a table. In the meantime it seems like the quickest answer for Guillaume might be to try to avoid keeping any NULLs in parent_application_id. regards, tom lane
pgsql-performance by date: