Thread: 8.4/9.0 simple query performance regression
All, Just got this simple case off IRC today: 8.4.4 This plan completes in 100ms: old_prod=# explain analyze select email from u_contact where id not in (select contact_id from u_user); QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Seq Scan on u_contact (cost=2217.72..4759.74 rows=35560 width=22) (actual time=61.283..107.169 rows=4521 loops=1) Filter: (NOT (hashed SubPlan 1)) SubPlan 1 -> Seq Scan on u_user (cost=0.00..2051.38 rows=66538 width=8) (actual time=0.034..33.303 rows=66978 loops=1) Total runtime: 108.001 ms 9.0.2 This plan does not complete in 15 minutes or more: new_prod=# explain select email from u_contact where id not in (select contact_id from u_user); QUERY PLAN --------------------------------------------------------------------------- Seq Scan on u_contact (cost=0.00..100542356.74 rows=36878 width=22) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Materialize (cost=0.00..2552.56 rows=69504 width=8) -> Seq Scan on u_user (cost=0.00..1933.04 rows=69504 width=8) (5 rows) I'm at a bit of a loss as to what's happening here. I'd guess another failure of a bail-out-early plan, but I can't see how that would work with this query. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > Just got this simple case off IRC today: > [ hashed versus non-hashed subplan ] > I'm at a bit of a loss as to what's happening here. Possibly work_mem is smaller in the second installation? (If I'm counting on my fingers right, you'd need a setting of at least a couple MB to let it choose a hashed subplan for this case.) regards, tom lane
07.06.11 00:45, Josh Berkus написав(ла): > All, > > Just got this simple case off IRC today: > > 8.4.4 > This plan completes in 100ms: > Filter: (NOT (hashed SubPlan 1)) > 9.0.2 > This plan does not complete in 15 minutes or more: > Filter: (NOT (SubPlan 1)) "Hashed" is the key. Hashed subplans usually has much better performance. You need to increase work_mem. I suppose it is in default state as you need not too much memory for hash of 70K integer values. BTW: Why do it want to materialize a result of seq scan without filter. I can see no benefits (or is it more narrow rows?) Best regards, Vitalii Tymchyshyn