Performance issue with nested loop - Mailing list pgsql-performance
From | Jens Reufsteck |
---|---|
Subject | Performance issue with nested loop |
Date | |
Msg-id | 009501c7ea25$92d60ea0$8c01a8c0@hobsons.loc Whole thread Raw |
Responses |
Re: [GENERAL] Performance issue with nested loop
|
List | pgsql-performance |
Hi all, I'm having a strange performance issue with two almost similar queries, the one running as expected, the other one taking far more time. The only difference is that I have "uniid in (10)" in the normally running query and "uniid in (9,10)" in the other one. The number of rows resulting from the respective table differs not very much being 406 for the first and 511 for the second query. This is the full query - the "uniid in (9,10)" is in the last subquery: SELECT 'Anzahl' AS column1, count(DISTINCT sid) AS column2 FROM ( SELECT sid FROM stud WHERE stud.status > 0 AND length(stud.vname) > 1 AND length(stud.nname) > 1 ) AS qur_filter_1 INNER JOIN ( SELECT DISTINCT sid FROM stud_vera INNER JOIN phon USING (sid) WHERE veraid = 22 AND stud_vera.status > 0 AND ( ( veraid IN (2, 3, 22, 24, 36) AND phontyp = 5 AND phon.typ = 1 AND phon.status > 0 ) OR ( veraid NOT IN (2, 3, 22, 24, 36) ) ) ) AS qur_filter_2 USING (sid) INNER JOIN ( SELECT DISTINCT sid FROM ausb INNER JOIN uni USING (uniid) WHERE uni.uniort IN ('Augsburg') AND ausb.overview = 1 AND ausb.zweitstudium != 2 AND ausb.status > 0 ) AS qur_filter_3 USING (sid) INNER JOIN ( SELECT DISTINCT sid FROM ausb WHERE uniid IN (9, 10) AND ausb.overview = 1 AND ausb.zweitstudium != 2 AND ausb.status > 0 ) AS qur_filter_4 USING (sid) These are the query-plans for both queries, first the problematic one: Aggregate (cost=78785.78..78785.79 rows=1 width=4) (actual time=698777.890..698777.891 rows=1 loops=1) -> Nested Loop (cost=65462.58..78785.78 rows=1 width=4) (actual time=6743.856..698776.957 rows=250 loops=1) Join Filter: ("outer".sid = "inner".sid) -> Merge Join (cost=11031.79..11883.12 rows=1 width=12) (actual time=387.837..433.612 rows=494 loops=1) Merge Cond: ("outer".sid = "inner".sid) -> Nested Loop (cost=5643.11..6490.17 rows=19 width=8) (actual time=114.323..154.043 rows=494 loops=1) -> Unique (cost=5643.11..5645.35 rows=180 width=4) (actual time=114.202..116.002 rows=511 loops=1) -> Sort (cost=5643.11..5644.23 rows=448 width=4) (actual time=114.199..114.717 rows=511 loops=1) Sort Key: public.ausb.sid -> Seq Scan on ausb (cost=0.00..5623.38 rows=448 width=4) (actual time=0.351..112.459 rows=511 loops=1) Filter: (((uniid = 9) OR (uniid = 10)) AND (overview = 1) AND (zweitstudium <> 2) AND (status > 0)) -> Index Scan using stud_pkey on stud (cost=0.00..4.67 rows=1 width=4) (actual time=0.062..0.067 rows=1 loops=511) Index Cond: (stud.sid = "outer".sid) Filter: ((status > 0) AND (length((vname)::text) > 1) AND (length((nname)::text) > 1)) -> Materialize (cost=5388.68..5392.05 rows=337 width=4) (actual time=273.506..276.785 rows=511 loops=1) -> Unique (cost=5383.29..5384.98 rows=337 width=4) (actual time=273.501..275.421 rows=511 loops=1) -> Sort (cost=5383.29..5384.13 rows=337 width=4) (actual time=273.499..274.091 rows=511 loops=1) Sort Key: public.ausb.sid -> Hash Join (cost=17.61..5369.14 rows=337 width=4) (actual time=1.139..272.465 rows=511 loops=1) Hash Cond: ("outer".uniid = "inner".uniid) -> Seq Scan on ausb (cost=0.00..4827.30 rows=104174 width=8) (actual time=0.026..200.111 rows=103593 loops=1) Filter: ((overview = 1) AND (zweitstudium <> 2) AND (status > 0)) -> Hash (cost=17.60..17.60 rows=2 width=4) (actual time=0.435..0.435 rows=2 loops=1) -> Seq Scan on uni (cost=0.00..17.60 rows=2 width=4) (actual time=0.412..0.424 rows=2 loops=1) Filter: ((uniort)::text = 'Augsburg'::text) -> Unique (cost=54430.79..66664.18 rows=10599 width=4) (actual time=6.851..1374.135 rows=40230 loops=494) -> Merge Join (cost=54430.79..66319.65 rows=137811 width=4) (actual time=6.849..1282.333 rows=40233 loops=494) Merge Cond: ("outer".sid = "inner".sid) Join Filter: (((("outer".veraid = 2) OR ("outer".veraid = 3) OR ("outer".veraid = 22) OR ("outer".veraid = 24) OR ("outer".veraid = 36)) AND ("inner".phontyp = 5) AND ("inner".typ = 1) AND ("inner".status > 0)) OR (("outer".veraid <> 2) AND ("outer".veraid <> 3) AND ("outer".veraid <> 22) AND ("outer".veraid <> 24) AND ("outer".veraid <> 36))) -> Sort (cost=11962.11..12098.59 rows=54593 width=8) (actual time=0.547..46.482 rows=53354 loops=494) Sort Key: stud_vera.sid -> Bitmap Heap Scan on stud_vera (cost=2239.14..7666.61 rows=54593 width=8) (actual time=43.096..165.300 rows=53354 loops=1) Recheck Cond: (veraid = 22) Filter: (status > 0) -> Bitmap Index Scan on stud_vera_sid_veraid_idx (cost=0.00..2239.14 rows=58765 width=0) (actual time=41.242..41.242 rows=61855 loops=1) Index Cond: (veraid = 22) -> Sort (cost=42468.68..43407.53 rows=375539 width=12) (actual time=6.297..533.711 rows=375539 loops=494) Sort Key: phon.sid -> Seq Scan on phon (cost=0.00..7696.39 rows=375539 width=12) (actual time=0.048..544.999 rows=375539 loops=1) !!!! The query-plan for the normally running query: !!!!! Aggregate (cost=77846.97..77846.98 rows=1 width=4) (actual time=5488.913..5488.913 rows=1 loops=1) -> Nested Loop (cost=65471.70..77846.97 rows=1 width=4) (actual time=3913.839..5488.513 rows=208 loops=1) Join Filter: ("outer".sid = "inner".sid) -> Merge Join (cost=60088.41..72454.41 rows=1 width=12) (actual time=3598.105..4841.242 rows=208 loops=1) Merge Cond: ("outer".sid = "inner".sid) -> Unique (cost=54430.79..66664.18 rows=10599 width=4) (actual time=3479.029..4697.051 rows=40129 loops=1) -> Merge Join (cost=54430.79..66319.65 rows=137811 width=4) (actual time=3479.027..4616.245 rows=40132 loops=1) Merge Cond: ("outer".sid = "inner".sid) Join Filter: (((("outer".veraid = 2) OR ("outer".veraid = 3) OR ("outer".veraid = 22) OR ("outer".veraid = 24) OR ("outer".veraid = 36)) AND ("inner".phontyp = 5) AND ("inner".typ = 1) AND ("inner".status > 0)) OR (("outer".veraid <> 2) AND ("outer".veraid <> 3) AND ("outer".veraid <> 22) AND ("outer".veraid <> 24) AND ("outer".veraid <> 36))) -> Sort (cost=11962.11..12098.59 rows=54593 width=8) (actual time=274.248..315.052 rows=53252 loops=1) Sort Key: stud_vera.sid -> Bitmap Heap Scan on stud_vera (cost=2239.14..7666.61 rows=54593 width=8) (actual time=46.669..167.599 rows=53352 loops=1) Recheck Cond: (veraid = 22) Filter: (status > 0) -> Bitmap Index Scan on stud_vera_sid_veraid_idx (cost=0.00..2239.14 rows=58765 width=0) (actual time=44.618..44.618 rows=61857 loops=1) Index Cond: (veraid = 22) -> Sort (cost=42468.68..43407.53 rows=375539 width=12) (actual time=3204.729..3681.598 rows=375090 loops=1) Sort Key: phon.sid -> Seq Scan on phon (cost=0.00..7696.39 rows=375539 width=12) (actual time=0.052..628.838 rows=375539 loops=1) -> Materialize (cost=5657.62..5657.71 rows=9 width=8) (actual time=91.290..105.557 rows=406 loops=1) -> Nested Loop (cost=5234.08..5657.61 rows=9 width=8) (actual time=91.282..104.571 rows=406 loops=1) -> Unique (cost=5234.08..5235.20 rows=90 width=4) (actual time=91.156..92.232 rows=420 loops=1) -> Sort (cost=5234.08..5234.64 rows=224 width=4) (actual time=91.154..91.484 rows=420 loops=1) Sort Key: public.ausb.sid -> Seq Scan on ausb (cost=0.00..5225.34 rows=224 width=4) (actual time=0.266..90.242 rows=420 loops=1) Filter: ((uniid = 10) AND (overview = 1) AND (zweitstudium <> 2) AND (status > 0)) -> Index Scan using stud_pkey on stud (cost=0.00..4.67 rows=1 width=4) (actual time=0.024..0.026 rows=1 loops=420) Index Cond: (stud.sid = "outer".sid) Filter: ((status > 0) AND (length((vname)::text) > 1) AND (length((nname)::text) > 1)) -> Unique (cost=5383.29..5384.98 rows=337 width=4) (actual time=1.520..2.686 rows=511 loops=208) -> Sort (cost=5383.29..5384.13 rows=337 width=4) (actual time=1.519..1.871 rows=511 loops=208) Sort Key: public.ausb.sid -> Hash Join (cost=17.61..5369.14 rows=337 width=4) (actual time=1.133..314.584 rows=511 loops=1) Hash Cond: ("outer".uniid = "inner".uniid) -> Seq Scan on ausb (cost=0.00..4827.30 rows=104174 width=8) (actual time=0.030..226.532 rows=103593 loops=1) Filter: ((overview = 1) AND (zweitstudium <> 2) AND (status > 0)) -> Hash (cost=17.60..17.60 rows=2 width=4) (actual time=0.392..0.392 rows=2 loops=1) -> Seq Scan on uni (cost=0.00..17.60 rows=2 width=4) (actual time=0.369..0.381 rows=2 loops=1) Filter: ((uniort)::text = 'Augsburg'::text) The estimated row numbers are not bad as long as one table is affected. They're much worse as soon as two or more tables are joined. Though the query plans are slightly different, the number of merged rows at different stages seems to be rather the same for both plans. The big difference in my eyes seems the cost for the first nested loop. This seems to be the point, where the long running query consumes most time. I've then set enable_nestloop to off, and actually the problem disappears. Other maybe relevant parameters: default_statistics_target = 100 work_mem = 4096 max_fsm_pages = 100000 My questions: What could be the problem behind high amount of actually used time for the nested loop in the first query? If we decided to constantly turn off nested loops, what side effects would we have to expect? Are there more granular ways to tell the query planner when to use nested loops? Or just other ideas what to do? We'd be grateful for any hint! Many thanks Jens -- Jens Reufsteck Hobsons GmbH Wildunger Straße 6 60487 Frankfurt am Main Deutschland Tel: +49 (69) 255 37-140 Fax: +49 (69) 255 37-2140 http://www.hobsons.de http://www.hobsons.ch Geschäftsführung: Christopher Letcher, Judith Oppitz, Adam Webster Amtsgericht Frankfurt HRB 58610
pgsql-performance by date: