Re: Re: lost records --- problem identified! - Mailing list pgsql-hackers
From | Tatsuo Ishii |
---|---|
Subject | Re: Re: lost records --- problem identified! |
Date | |
Msg-id | 20000822215722O.t-ishii@sra.co.jp Whole thread Raw |
In response to | Re: lost records --- problem identified! (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Re: lost records --- problem identified!
|
List | pgsql-hackers |
> Ah, the reason I couldn't see the problem is I was using the wrong > query. I happened to look at the 'test' file you had sitting there > and saw that it was a join against one more table than I was using; > there wasn't anything about the 'stock' table in the query you'd sent > by mail. > > Once I had the right query I was able to replicate the problem here. > It's a planning bug. A simplified version is: > > create table t1 (f1 int, f2 int); > create table t2 (f3 int, f4 int); > insert into t1 values(1,10); > insert into t1 values(2,9); > insert into t1 values(3,8); > insert into t1 values(4,7); > insert into t1 values(3,3); > insert into t1 values(3,0); > insert into t1 values(10,0); > insert into t1 values(10,-1); > insert into t2 values(1,1); > insert into t2 values(3,3); > insert into t2 values(2,2); > select * from t1,t2 where f2 = f3 and f1 = f3; > > This should produce one row (of 3's), but will not unless you set > enable_mergejoin to OFF. The problem is that the produced plan > is basically > > Merge Join using "f3 = f1 and f3 = f2" > -> Sort by f3 > -> Seq Scan on t2 > -> Sort by f2 > -> Seq Scan on t1 > > The system knows enough to realize that all the valid output rows > will have f1 = f2 by transitivity, but unfortunately it's then > concluding that it's OK to sort t1 by f2 instead of f1, which is > NOT OK in terms of the ordering the merge needs --- the merge expects > major order by f1 and will miss records if that's not correct. > > I think the proper fix is to gin up an actual WHERE clause "f1 = f2" > and apply it to restrict the output of the seqscan on t1. Then the > output of the sort will indeed have the expected ordering, ie, f1 or f2 > interchangeably. (Actually, the extra WHERE clause might well cause > a different plan to be chosen, because it will give the > restriction-selectivity code information it didn't have before. > But assuming the same plan structure it will work rather than fail.) > > This is a new bug in 7.0.* --- earlier versions didn't have it because > they had no concept of transitive closure of sort keys. Oh well, live > and learn. > > I will work on fixing this in current sources and then see if it's > practical to back-patch it into 7.0.*. In the meantime, I recommend > patching your queries by hand such that all the implied equalities > are mentioned explicitly. That is, instead of > > part_info.item_num = po_line_item.item_num and > parts.item_num = po_line_item.item_num and > stock.item_num = parts.item_num and > > you'd need something like > > part_info.item_num = po_line_item.item_num and > part_info.item_num = parts.item_num and > part_info.item_num = stock.item_num and > parts.item_num = po_line_item.item_num and > stock.item_num = parts.item_num and > stock.item_num = po_line_item.item_num and > > Ugh :-(. Another possibility is "set enable_mergejoin to off" ... > as far as I know, only mergejoin is sufficiently dependent on input > ordering to be bitten by this problem. > > regards, tom lane Have you fixed this in current? If so, are you going to make the back-patch for 7.0.*? It seems the problem is critical... -- Tatsuo Ishii
pgsql-hackers by date: