Thread: left join performance problem
Hi, I am using pg 7.3.1 on a dual r.h. 7.3 box. I have a big problem with pg left join performance. My plan is: =# explain analyze select D.IDS AS DIDS ,D.IDS_SKLAD, D.IDS_KO AS DIDSKO,KL.MNAME AS KLNAME, D.NOMER AS DNOMER,D.DATE_OP, S.MED AS MEDNAME, NOM.MNAME AS NOMNAME,S.IDS_NUM, S.KOL, S.CENA,S.VAL,S.TOT,S.DTO,S.PTO ,M.OTN AS MOTN FROM A_KLIENTI KL , A_NOMEN NOM, A_DOC D,A_SKLA D S left outer join A_MESKLAD M ON(S.IDS=M.IDS) WHERE D.OP=4 AND D.IDS=S.IDS_DOC AND D.IDS_KO=KL.IDS AND S.IDS_NUM=NOM.IDS AND KL.IDS_G RUPA = 'SOF_112' ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=460869.55..470785.29 rows=20 width=1034) (actual time=50139.27..57565.34 rows=12990 loops=1) -> Hash Join (cost=460869.55..470662.48 rows=20 width=862) (actual time=50139.02..57246.35 rows=12990 loops=1) Hash Cond: ("outer".ids_doc = "inner".ids) -> Merge Join (cost=457324.89..463038.60 rows=815792 width=356) (actual time=48128.32..53430.02 rows=815926 loops=1) Merge Cond: ("outer".ids = "inner".ids) -> Index Scan using a_mesklad_pkey on a_mesklad m (cost=0.00..1395.47 rows=15952 width=72) (actual time=0.21..109.19 rows=15952 loops=1) -> Sort (cost=457324.89..459364.37 rows=815792 width=284) (actual time=48128.05..49380.06 rows=815926 loops=1) Sort Key: s.ids -> Seq Scan on a_sklad s (cost=0.00..74502.92 rows=815792 width=284) (actual time=4.32..16777.16 rows=815926 loops=1) -> Hash (cost=3544.65..3544.65 rows=3 width=506) (actual time=1104.34..1104.34 rows=0 loops=1) -> Hash Join (cost=905.35..3544.65 rows=3 width=506) (actual time=428.32..1098.52 rows=1966 loops=1) Hash Cond: ("outer".ids_ko = "inner".ids) -> Index Scan using i_doc_op on a_doc d (cost=0.00..2625.71 rows=677 width=244) (actual time=29.27..690.86 rows=1981 loops=1) Index Cond: (op = 4) -> Hash (cost=905.19..905.19 rows=65 width=262) (actual time=398.97..398.97 rows=0 loops=1) -> Seq Scan on a_klienti kl (cost=0.00..905.19 rows=65 width=262) (actual time=396.68..398.93 rows=7 loops=1) Filter: (ids_grupa = 'SOF_112'::name) -> Index Scan using a_nomen_pkey on a_nomen nom (cost=0.00..6.01 rows=1 width=172) (actual time=0.01..0.02 rows=1 loops=12990) Index Cond: ("outer".ids_num = nom.ids) Total runtime: 57749.24 msec (20 rows) If I remove the join ( I know it is not very correct and I receive 19 rows as answer) it is working very fast. The plan is: explain analyze select D.IDS AS DIDS ,D.IDS_SKLAD, D.IDS_KO AS DIDSKO,KL.MNAME AS KLNAME, D.NOMER AS DNOMER,D.DATE_OP, S.MED AS MEDNAME, NOM.MNAME AS NOMNAME,S.IDS_NUM, S.KOL, S.CENA,S.VAL,S.TOT,S.DTO,S.PTO ,M.OTN AS MOTN FROM A_KLIENTI KL , A_NOMEN NOM, A_DOC D,A_SKLAD S ,A_MESKLAD M WHERE S.IDS=M.IDS AND D.OP=4 AND D.IDS=S.IDS_DOC AND D.IDS_KO=KL.IDS AND S.IDS_NUM=NOM.IDS AND D.NOMER like '%0905' AND KL.IDS_GRUPA = 'SOF_112' ORDER BY D.IDS,S.IDS_NUM,S.ORDER_NUM ; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=18897.33..18897.33 rows=1 width=1038) (actual time=36.33..36.35 rows=48 loops=1) Sort Key: d.ids, s.ids_num, s.order_num -> Nested Loop (cost=0.00..18897.32 rows=1 width=1038) (actual time=30.90..35.93 rows=48 loops=1) -> Nested Loop (cost=0.00..18891.29 rows=1 width=866) (actual time=30.70..33.34 rows=48 loops=1) -> Nested Loop (cost=0.00..18885.28 rows=1 width=794) (actual time=30.44..31.98 rows=48 loops=1) -> Nested Loop (cost=0.00..2633.93 rows=1 width=506) (actual time=30.18..30.62 rows=1 loops=1) -> Index Scan using i_doc_op on a_doc d (cost=0.00..2627.40 rows=1 width=244) (actual time=29.93..30.36 rows=1 loops=1) Index Cond: (op = 4) Filter: (nomer ~~ '%0905'::text) -> Index Scan using a_klienti_pkey on a_klienti kl (cost=0.00..6.01 rows=1 width=262) (actual time=0.23..0.23 rows=1 loops=1) Index Cond: ("outer".ids_ko = kl.ids) Filter: (ids_grupa = 'SOF_112'::name) -> Index Scan using i_sklad_ids_doc on a_sklad s (cost=0.00..16200.36 rows=4079 width=288) (actual time=0.24..0.95 rows=48 loops=1) Index Cond: ("outer".ids = s.ids_doc) -> Index Scan using a_mesklad_pkey on a_mesklad m (cost=0.00..6.01 rows=1 width=72) (actual time=0.02..0.02 rows=1 loops=48) Index Cond: ("outer".ids = m.ids) -> Index Scan using a_nomen_pkey on a_nomen nom (cost=0.00..6.01 rows=1 width=172) (actual time=0.04..0.04 rows=1 loops=48) Index Cond: ("outer".ids_num = nom.ids) Total runtime: 36.98 msec (19 rows) Also S.IDS and M.IDS are name and primary key's. I can not find my problem. Any idea will help. Of cours I can make the query with two selects and will work fast, but I think it is not good solution. regards, ivan.
pginfo <pginfo@t1.unisoftbg.com> writes: > I have a big problem with pg left join performance. I think the problem is that the LEFT JOIN clause is forcing the planner to join A_SKLAD to A_MESKLAD before anything else, whereas a good plan would do some of the other joins first to eliminate as many rows as possible. You will need to revise the query to let the LEFT JOIN happen later. For discussion see http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=explicit-joins.html regards, tom lane
Many thanks Tom,
the doc do not contain solution for this case, but the idea to
change the join order was excelent and all is working fine at the moment.
the doc do not contain solution for this case, but the idea to
change the join order was excelent and all is working fine at the moment.
regards,
ivan.
Tom Lane wrote:
pginfo <pginfo@t1.unisoftbg.com> writes:
> I have a big problem with pg left join performance.I think the problem is that the LEFT JOIN clause is forcing the
planner to join A_SKLAD to A_MESKLAD before anything else, whereas
a good plan would do some of the other joins first to eliminate
as many rows as possible. You will need to revise the query to
let the LEFT JOIN happen later. For discussion see
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=explicit-joins.htmlregards, tom lane
On Thu, Jun 12, 2003 at 06:48:27AM +0200, pginfo wrote: > Many thanks Tom, > the doc do not contain solution for this case, but the idea to > change the join order was excelent and all is working fine at the moment. Any chance of getting a TODO added that would provide the option of having the optimizer pick join order when you're using the ANSI join syntax? IMHO I think it's bad that using the ANSI syntax forces join order; it would be much better to come up with a custom syntax for this like everyone else does. But I'm sure people won't want to change the existing behavior, so special syntax to do the opposite is almost as good. -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
"Jim C. Nasby" <jim@nasby.net> writes: > Any chance of getting a TODO added that would provide the option of > having the optimizer pick join order when you're using the ANSI join > syntax? No ... because it's already DONE. regards, tom lane
On Mon, Jun 16, 2003 at 01:28:26AM -0400, Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: > > Any chance of getting a TODO added that would provide the option of > > having the optimizer pick join order when you're using the ANSI join > > syntax? > > No ... because it's already DONE. DOH, I forgot about the subselect trick. Nevermind. *wipes egg off face* -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
On Mon, Jun 16, 2003 at 00:36:29 -0500, "Jim C. Nasby" <jim@nasby.net> wrote: > On Mon, Jun 16, 2003 at 01:28:26AM -0400, Tom Lane wrote: > > "Jim C. Nasby" <jim@nasby.net> writes: > > > Any chance of getting a TODO added that would provide the option of > > > having the optimizer pick join order when you're using the ANSI join > > > syntax? > > > > No ... because it's already DONE. > > DOH, I forgot about the subselect trick. Nevermind. In 7.4 there is a GUC setting to control this. I believe the default is to not constrain the join order any more than is necessary to preserve semantics.
"Jim C. Nasby" <jim@nasby.net> writes: > On Mon, Jun 16, 2003 at 01:28:26AM -0400, Tom Lane wrote: >> "Jim C. Nasby" <jim@nasby.net> writes: > Any chance of getting a TODO added that would provide the option of > having the optimizer pick join order when you're using the ANSI join > syntax? >> >> No ... because it's already DONE. > DOH, I forgot about the subselect trick. Nevermind. No, I wasn't talking about that. See http://developer.postgresql.org/docs/postgres/explicit-joins.html for the way it works in CVS tip. regards, tom lane
On Mon, Jun 16, 2003 at 09:26:13AM -0400, Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: > > On Mon, Jun 16, 2003 at 01:28:26AM -0400, Tom Lane wrote: > >> "Jim C. Nasby" <jim@nasby.net> writes: > > Any chance of getting a TODO added that would provide the option of > > having the optimizer pick join order when you're using the ANSI join > > syntax? > >> > >> No ... because it's already DONE. > > > DOH, I forgot about the subselect trick. Nevermind. > > No, I wasn't talking about that. See > http://developer.postgresql.org/docs/postgres/explicit-joins.html > for the way it works in CVS tip. Ahh, cool. BTW, I think it should be prominently mentioned in http://developer.postgresql.org/docs/postgres/sql-select.html#SQL-FROM that ANSI join syntax can force join order, since most DBA's unfamiliar with pgsql probably won't be expecting that. -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"