Re: Re: lost records --- problem identified! - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Re: lost records --- problem identified!
Date
Msg-id 20000822215722O.t-ishii@sra.co.jp
Whole thread Raw
In response to Re: lost records --- problem identified!  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: lost records --- problem identified!
List pgsql-hackers
> Ah, the reason I couldn't see the problem is I was using the wrong
> query.  I happened to look at the 'test' file you had sitting there
> and saw that it was a join against one more table than I was using;
> there wasn't anything about the 'stock' table in the query you'd sent
> by mail.
> 
> Once I had the right query I was able to replicate the problem here.
> It's a planning bug.  A simplified version is:
> 
> create table t1 (f1 int, f2 int);
> create table t2 (f3 int, f4 int);
> insert into t1 values(1,10);
> insert into t1 values(2,9);
> insert into t1 values(3,8);
> insert into t1 values(4,7);
> insert into t1 values(3,3);
> insert into t1 values(3,0);
> insert into t1 values(10,0);
> insert into t1 values(10,-1);
> insert into t2 values(1,1);
> insert into t2 values(3,3);
> insert into t2 values(2,2);
> select * from t1,t2 where f2 = f3 and f1 = f3;
> 
> This should produce one row (of 3's), but will not unless you set
> enable_mergejoin to OFF.  The problem is that the produced plan
> is basically
> 
> Merge Join using "f3 = f1 and f3 = f2"
>   ->  Sort by f3
>         ->  Seq Scan on t2
>   ->  Sort by f2
>         ->  Seq Scan on t1
> 
> The system knows enough to realize that all the valid output rows
> will have f1 = f2 by transitivity, but unfortunately it's then
> concluding that it's OK to sort t1 by f2 instead of f1, which is
> NOT OK in terms of the ordering the merge needs --- the merge expects
> major order by f1 and will miss records if that's not correct.
> 
> I think the proper fix is to gin up an actual WHERE clause "f1 = f2"
> and apply it to restrict the output of the seqscan on t1.  Then the
> output of the sort will indeed have the expected ordering, ie, f1 or f2
> interchangeably.  (Actually, the extra WHERE clause might well cause
> a different plan to be chosen, because it will give the
> restriction-selectivity code information it didn't have before.
> But assuming the same plan structure it will work rather than fail.)
> 
> This is a new bug in 7.0.* --- earlier versions didn't have it because
> they had no concept of transitive closure of sort keys.  Oh well, live
> and learn.
> 
> I will work on fixing this in current sources and then see if it's
> practical to back-patch it into 7.0.*.  In the meantime, I recommend
> patching your queries by hand such that all the implied equalities
> are mentioned explicitly.  That is, instead of
> 
>      part_info.item_num = po_line_item.item_num and 
>      parts.item_num = po_line_item.item_num and 
>      stock.item_num = parts.item_num and
> 
> you'd need something like
> 
>      part_info.item_num = po_line_item.item_num and 
>      part_info.item_num = parts.item_num and
>      part_info.item_num = stock.item_num and
>      parts.item_num = po_line_item.item_num and 
>      stock.item_num = parts.item_num and
>      stock.item_num = po_line_item.item_num and
> 
> Ugh :-(.  Another possibility is "set enable_mergejoin to off" ...
> as far as I know, only mergejoin is sufficiently dependent on input
> ordering to be bitten by this problem.
> 
>             regards, tom lane

Have you fixed this in current? If so, are you going to make the
back-patch for 7.0.*?

It seems the problem is critical...
--
Tatsuo Ishii



pgsql-hackers by date:

Previous
From: "Mark Hollomon"
Date:
Subject: Re: functional index arg matching patch
Next
From: Tom Lane
Date:
Subject: Re: Re: lost records --- problem identified!