Thread: Re: lost records --- problem identified!

Re: lost records --- problem identified!

From
Tom Lane
Date:
Ah, the reason I couldn't see the problem is I was using the wrong
query.  I happened to look at the 'test' file you had sitting there
and saw that it was a join against one more table than I was using;
there wasn't anything about the 'stock' table in the query you'd sent
by mail.

Once I had the right query I was able to replicate the problem here.
It's a planning bug.  A simplified version is:

create table t1 (f1 int, f2 int);
create table t2 (f3 int, f4 int);
insert into t1 values(1,10);
insert into t1 values(2,9);
insert into t1 values(3,8);
insert into t1 values(4,7);
insert into t1 values(3,3);
insert into t1 values(3,0);
insert into t1 values(10,0);
insert into t1 values(10,-1);
insert into t2 values(1,1);
insert into t2 values(3,3);
insert into t2 values(2,2);
select * from t1,t2 where f2 = f3 and f1 = f3;

This should produce one row (of 3's), but will not unless you set
enable_mergejoin to OFF.  The problem is that the produced plan
is basically

Merge Join using "f3 = f1 and f3 = f2" ->  Sort by f3       ->  Seq Scan on t2 ->  Sort by f2       ->  Seq Scan on t1

The system knows enough to realize that all the valid output rows
will have f1 = f2 by transitivity, but unfortunately it's then
concluding that it's OK to sort t1 by f2 instead of f1, which is
NOT OK in terms of the ordering the merge needs --- the merge expects
major order by f1 and will miss records if that's not correct.

I think the proper fix is to gin up an actual WHERE clause "f1 = f2"
and apply it to restrict the output of the seqscan on t1.  Then the
output of the sort will indeed have the expected ordering, ie, f1 or f2
interchangeably.  (Actually, the extra WHERE clause might well cause
a different plan to be chosen, because it will give the
restriction-selectivity code information it didn't have before.
But assuming the same plan structure it will work rather than fail.)

This is a new bug in 7.0.* --- earlier versions didn't have it because
they had no concept of transitive closure of sort keys.  Oh well, live
and learn.

I will work on fixing this in current sources and then see if it's
practical to back-patch it into 7.0.*.  In the meantime, I recommend
patching your queries by hand such that all the implied equalities
are mentioned explicitly.  That is, instead of
    part_info.item_num = po_line_item.item_num and     parts.item_num = po_line_item.item_num and     stock.item_num =
parts.item_numand
 

you'd need something like
    part_info.item_num = po_line_item.item_num and     part_info.item_num = parts.item_num and    part_info.item_num =
stock.item_numand    parts.item_num = po_line_item.item_num and     stock.item_num = parts.item_num and
stock.item_num= po_line_item.item_num and
 

Ugh :-(.  Another possibility is "set enable_mergejoin to off" ...
as far as I know, only mergejoin is sufficiently dependent on input
ordering to be bitten by this problem.
        regards, tom lane


Re: Re: lost records --- problem identified!

From
Tatsuo Ishii
Date:
> Ah, the reason I couldn't see the problem is I was using the wrong
> query.  I happened to look at the 'test' file you had sitting there
> and saw that it was a join against one more table than I was using;
> there wasn't anything about the 'stock' table in the query you'd sent
> by mail.
> 
> Once I had the right query I was able to replicate the problem here.
> It's a planning bug.  A simplified version is:
> 
> create table t1 (f1 int, f2 int);
> create table t2 (f3 int, f4 int);
> insert into t1 values(1,10);
> insert into t1 values(2,9);
> insert into t1 values(3,8);
> insert into t1 values(4,7);
> insert into t1 values(3,3);
> insert into t1 values(3,0);
> insert into t1 values(10,0);
> insert into t1 values(10,-1);
> insert into t2 values(1,1);
> insert into t2 values(3,3);
> insert into t2 values(2,2);
> select * from t1,t2 where f2 = f3 and f1 = f3;
> 
> This should produce one row (of 3's), but will not unless you set
> enable_mergejoin to OFF.  The problem is that the produced plan
> is basically
> 
> Merge Join using "f3 = f1 and f3 = f2"
>   ->  Sort by f3
>         ->  Seq Scan on t2
>   ->  Sort by f2
>         ->  Seq Scan on t1
> 
> The system knows enough to realize that all the valid output rows
> will have f1 = f2 by transitivity, but unfortunately it's then
> concluding that it's OK to sort t1 by f2 instead of f1, which is
> NOT OK in terms of the ordering the merge needs --- the merge expects
> major order by f1 and will miss records if that's not correct.
> 
> I think the proper fix is to gin up an actual WHERE clause "f1 = f2"
> and apply it to restrict the output of the seqscan on t1.  Then the
> output of the sort will indeed have the expected ordering, ie, f1 or f2
> interchangeably.  (Actually, the extra WHERE clause might well cause
> a different plan to be chosen, because it will give the
> restriction-selectivity code information it didn't have before.
> But assuming the same plan structure it will work rather than fail.)
> 
> This is a new bug in 7.0.* --- earlier versions didn't have it because
> they had no concept of transitive closure of sort keys.  Oh well, live
> and learn.
> 
> I will work on fixing this in current sources and then see if it's
> practical to back-patch it into 7.0.*.  In the meantime, I recommend
> patching your queries by hand such that all the implied equalities
> are mentioned explicitly.  That is, instead of
> 
>      part_info.item_num = po_line_item.item_num and 
>      parts.item_num = po_line_item.item_num and 
>      stock.item_num = parts.item_num and
> 
> you'd need something like
> 
>      part_info.item_num = po_line_item.item_num and 
>      part_info.item_num = parts.item_num and
>      part_info.item_num = stock.item_num and
>      parts.item_num = po_line_item.item_num and 
>      stock.item_num = parts.item_num and
>      stock.item_num = po_line_item.item_num and
> 
> Ugh :-(.  Another possibility is "set enable_mergejoin to off" ...
> as far as I know, only mergejoin is sufficiently dependent on input
> ordering to be bitten by this problem.
> 
>             regards, tom lane

Have you fixed this in current? If so, are you going to make the
back-patch for 7.0.*?

It seems the problem is critical...
--
Tatsuo Ishii



Re: Re: lost records --- problem identified!

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Have you fixed this in current? If so, are you going to make the
> back-patch for 7.0.*?

Yes, and no.  I think the patch would be too risky to drop into 7.0.*
with no beta testing...
        regards, tom lane


Re: Re: lost records --- problem identified!

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Have you fixed this in current? If so, are you going to make the
> > back-patch for 7.0.*?
> 
> Yes, and no.  I think the patch would be too risky to drop into 7.0.*
> with no beta testing...

I think we need beta testing then. Otherwise we would have to tell
people to disable merge join until 7.1 gets released, I guess.
--
Tatsuo Ishii


Re: Re: lost records --- problem identified!

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> Yes, and no.  I think the patch would be too risky to drop into 7.0.*
>> with no beta testing...

> I think we need beta testing then. Otherwise we would have to tell
> people to disable merge join until 7.1 gets released, I guess.

Well ... mumble ... as far as I've heard, you and Frank are the only
ones to get bit by it so far.  So I'm not sure it deserves a fire-drill
response.  There is at least one other "fatal" bug in 7.0 mergejoin
(planner will accept an inner plan of a type that doesn't support mark/
restore; this is still not fixed in current sources) not to mention a
fatal bug in hashjoin (hash table portal may get released before cursor
portal; this is fixed in current sources but I see no practical way to
fix it under 7.0 memory management).  Is it worth trying to beta-test
our way to a 7.0.3 release?  I'd vote for pushing up the 7.1 release
schedule instead, if we are sufficiently worried about these issues.
        regards, tom lane