Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)
Date
Msg-id 8181.1521310826@sss.pgh.pa.us
Whole thread Raw
In response to Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)  (Andres Freund <andres@anarazel.de>)
Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Not sure, but the backbranches seem to be working fine, and the commit
> that triggers the issue is from December 31. Maybe the issue was there
> but we were lucky not to trip on it before.

Yeah, we were simply not testing that overflow-detection code before.
Undoubtedly it would fail in the back branches too if we tested it.

> Anyway, I can confirm that the fix suggested by Tom does the trick
> (well, at least on Fulmar, which is running icc 14.0.3). I've also
> disassembled exec_stmt_fori both with and without the patch - reading
> assembly in not my strength, but if you're interested it's attached. The
> interesting part seems to be the last ~50 lines or so.

Hm, did you get the "master" and "patched" versions backwards?  The
allegedly-patched version does the !reverse case like this:

   0x00007f71219457ae <+2200>:    mov    -0x108(%rbp),%eax
   0x00007f71219457b4 <+2206>:    test   %eax,%eax
   0x00007f71219457b6 <+2208>:    jl     0x7f71219457cf <exec_stmt_fori+2233>
   0x00007f71219457b8 <+2210>:    mov    -0x108(%rbp),%eax
   0x00007f71219457be <+2216>:    add    -0x110(%rbp),%eax
   0x00007f71219457c4 <+2222>:    mov    %eax,-0x110(%rbp)
   0x00007f71219457ca <+2228>:    jmpq   0x7f7121945573 <exec_stmt_fori+1629>

so that it's apparently optimized

            if ((int32) (loop_value + step_value) < loop_value)
                break;

into

            if (step_value < 0)
                break;

which of course never exits the loop.  That's slightly different
(and stupider) than what I'd been hypothesizing, but it's a valid
transformation if you ignore the possibility of integer overflow.

It might be worth studying the icc manual to see if it has an
equivalent of -fwrapv.  Although we can and probably should fix
this case by changing the code, I'm worried about what other bugs
may exist only in icc builds.  I know Andres would like to get
rid of the need for -fwrapv but I suspect that's not really going
to happen soon.

            regards, tom lane


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)
Next
From: Dean Rasheed
Date:
Subject: Re: MCV lists for highly skewed distributions