Thread: seg regression failures

seg regression failures

From
Magnus Hagander
Date:
contrib/seg fails regression tests on my msvc build (reason Ididn't notice
earlier is probably that you couldn't run contribcheck on the vcbuild).
regression.diff attached.

Anybody see a smoking gun? I don't really know what seg is supposed to be
doing, so I don't know where to start.

(had to send this to -patches, because -hackers denied the attachment..)

//Magnus


Attachment

Re: seg regression failures

From
Teodor Sigaev
Date:
> Anybody see a smoking gun? I don't really know what seg is supposed to be
> doing, so I don't know where to start.

As crazy idea only: try to modify seg_overlap to
bool
seg_overlap(SEG * a, SEG * b)
{
     return (
             ((a->upper >= b->upper) && (a->lower <= b->upper))
             ||
             ((b->upper >= a->upper) && (b->lower <= a->upper))
         ) ? true : false;
}


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: seg regression failures

From
Magnus Hagander
Date:
On Fri, Mar 23, 2007 at 01:15:10PM +0300, Teodor Sigaev wrote:
> >Anybody see a smoking gun? I don't really know what seg is supposed to be
> >doing, so I don't know where to start.
>
> As crazy idea only: try to modify seg_overlap to
> bool
> seg_overlap(SEG * a, SEG * b)
> {
>     return (
>             ((a->upper >= b->upper) && (a->lower <= b->upper))
>             ||
>             ((b->upper >= a->upper) && (b->lower <= a->upper))
>         ) ? true : false;
> }

Nope, no difference.

//Magnus

Re: seg regression failures

From
Andrew Dunstan
Date:
Magnus Hagander wrote:
> contrib/seg fails regression tests on my msvc build (reason Ididn't notice
> earlier is probably that you couldn't run contribcheck on the vcbuild).
> regression.diff attached.
>
> Anybody see a smoking gun? I don't really know what seg is supposed to be
> doing, so I don't know where to start.
>
> (had to send this to -patches, because -hackers denied the attachment..)
>
>
[snip]

I think I would have put the diff up somewhere on a web site, although i
guess this way we have a record of it.

Anyway, I think you probably need to load up the old debugger  and put a
break point in the overlap function ... surely the error can't be in
float4in or else we'd have seen other regression problems.

cheers

andrew

Re: seg regression failures

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Anyway, I think you probably need to load up the old debugger  and put a
> break point in the overlap function ... surely the error can't be in
> float4in or else we'd have seen other regression problems.

One of the failing test cases is for seg_over_left, which is so trivial
that it's pretty hard to believe even a Microsoft compiler could get it
wrong ;-).  My bet is something wrong in segparse.y or possibly
segscan.l, leading to garbage seg values being produced.  We've seen
segparse.y trigger compiler bugs before --- look at its CVS history.

            regards, tom lane

Re: seg regression failures

From
Magnus Hagander
Date:
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Anyway, I think you probably need to load up the old debugger  and put a
>> break point in the overlap function ... surely the error can't be in
>> float4in or else we'd have seen other regression problems.
>
> One of the failing test cases is for seg_over_left, which is so trivial
> that it's pretty hard to believe even a Microsoft compiler could get it
> wrong ;-).  My bet is something wrong in segparse.y or possibly
> segscan.l, leading to garbage seg values being produced.  We've seen
> segparse.y trigger compiler bugs before --- look at its CVS history.

Loading up the debugger (I'm so happy not to have gdb-win32 now :-P), I
get some interesting stuff. My test case is "SELECT '1'::seg &&
'2'::seg" which should return false, but returns true.

I also tried seg_overlap('1'::seg,'2'::seg) to make the path easier, but
same problem - still returns 't' when it should return'f'.

The SEG parameters going into seg_overlap() look perfectly correct, and
seg_overlap() actually returns 0. But this is somehow later turned into
't'. Any pointers for where to look for how that happens?

For example, calling seg_same('1'::seg,'2'::seg) works fine, so it's not
something in general with bool functions, or with the input functions.
In fact, those two functions looks very very similar to me.

//Magnus

Re: seg regression failures

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> The SEG parameters going into seg_overlap() look perfectly correct, and
> seg_overlap() actually returns 0. But this is somehow later turned into
> 't'. Any pointers for where to look for how that happens?

I'll betcha that MSVC is generating code that only sets the low-order
byte of the return register (EAX likely) where GCC tends to set the
whole register.  So when the returned value is taken as a Datum, it
might contain some garbage.

Seems like we need to either reconsider the definition of DatumGetBool,
or decree that old-style functions returning bool are broken.

I'm a bit surprised this hasn't come up before, actually, since it seems
like it could happen on a lot of architectures.  Fixing DatumGetBool is
probably the right thing to do.

-#define DatumGetBool(X) ((bool) (((Datum) (X)) != 0))
+#define DatumGetBool(X) ((bool) (((bool) (X)) != 0))

            regards, tom lane

Re: seg regression failures

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> The SEG parameters going into seg_overlap() look perfectly correct, and
>> seg_overlap() actually returns 0. But this is somehow later turned into
>> 't'. Any pointers for where to look for how that happens?
>
> I'll betcha that MSVC is generating code that only sets the low-order
> byte of the return register (EAX likely) where GCC tends to set the
> whole register.  So when the returned value is taken as a Datum, it
> might contain some garbage.

That looks very consistent with the data in my debugger (returned value
is not zero for the whole datum, but casted to bool it's fine).

I'm just going to clean up a compile error I introduced and try your
suggestion below, and I'll let you know how it works out.

//Magnus

Re: seg regression failures

From
Magnus Hagander
Date:
Magnus Hagander wrote:
> Tom Lane wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>> The SEG parameters going into seg_overlap() look perfectly correct, and
>>> seg_overlap() actually returns 0. But this is somehow later turned into
>>> 't'. Any pointers for where to look for how that happens?
>> I'll betcha that MSVC is generating code that only sets the low-order
>> byte of the return register (EAX likely) where GCC tends to set the
>> whole register.  So when the returned value is taken as a Datum, it
>> might contain some garbage.
>
> That looks very consistent with the data in my debugger (returned value
> is not zero for the whole datum, but casted to bool it's fine).
>
> I'm just going to clean up a compile error I introduced and try your
> suggestion below, and I'll let you know how it works out.

Yup, that was indeed the problem. I assume you'll do the honors.

//Magnus

Re: seg regression failures

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Yup, that was indeed the problem. I assume you'll do the honors.

Yeah, the comments need fixed too.  I'll do it later today.

            regards, tom lane

Re: seg regression failures

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Yup, that was indeed the problem. I assume you'll do the honors.
>
> Yeah, the comments need fixed too.  I'll do it later today.

Thanks.

//Magnus