Thread: BUG #17725: Segfault when seg_in() called with a large argument

BUG #17725: Segfault when seg_in() called with a large argument

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17725
Logged by:          Robins Tharakan
Email address:      tharakan@gmail.com
PostgreSQL version: 15.1
Operating system:   Ubuntu 20.04
Description:

Hi,

The following SQL Segfaults on master (tested on b3bb7d12af).

SQL: SELECT seg_in(numeric_out(round(31, 10000)))


Backtrace on ea5ae4cae6@REL_14_STABLE:
=====================================
#0  __strcpy_avx2 () at ../sysdeps/x86_64/multiarch/strcpy-avx2.S:578
#1  0x00007f31c421f4aa in restore (
    result=0x55009893ace0 <error: Cannot access memory at address 0x55009893ace0>, val=31, n=-46) at seg.c:1009
#2  0x00007f31c421dab9 in seg_out (fcinfo=0x7ffe3ddff6c0) at seg.c:135
#3  0x000055d296a40aa9 in FunctionCall1Coll (flinfo=0x55d298735478, 
    collation=0, arg1=94362989160448) at fmgr.c:1138
#4  0x000055d296a42004 in OutputFunctionCall (flinfo=0x55d298735478, 
    val=94362989160448) at fmgr.c:1575
#5  0x000055d29634a8b4 in printtup (slot=0x55d2987344b8, self=0x55d298936cc0)
    at printtup.c:357
#6  0x000055d2966196c6 in ExecutePlan (estate=0x55d298733f80,
    planstate=0x55d2987341b8, use_parallel_mode=false, operation=CMD_SELECT,
    sendTuples=true, numberTuples=0, direction=ForwardScanDirection,
    dest=0x55d298936cc0, execute_once=true) at execMain.c:1582
#7  0x000055d2966172fd in standard_ExecutorRun (queryDesc=0x55d2987289d0, 
    direction=ForwardScanDirection, count=0, execute_once=true)
    at execMain.c:361
#8  0x00007f31dbea134d in pgss_ExecutorRun (queryDesc=0x55d2987289d0, 
    direction=ForwardScanDirection, count=0, execute_once=true)
    at pg_stat_statements.c:1003
#9  0x000055d2966170f3 in ExecutorRun (queryDesc=0x55d2987289d0, 
    direction=ForwardScanDirection, count=0, execute_once=true)
    at execMain.c:303


Backtrace Full excerpt:
======================
#0  __strcpy_avx2 () at ../sysdeps/x86_64/multiarch/strcpy-avx2.S:578
No locals.
#1  0x00007f31c421f4aa in restore (
    result=0x55009893ace0 <error: Cannot access memory at address 0x55009893ace0>, val=31, n=-46) at seg.c:1009
        buf = "00000000003e1\000\060\060\060\060\060\060\060\060\060\060"
        p = 0x55d29893ace8 "e+01"
        exp = 48
        i = 17
        dp = 11
        sign = 0
#2  0x00007f31c421dab9 in seg_out (fcinfo=0x7ffe3ddff6c0) at seg.c:135
        seg = 0x55d29872e800
        result = 0x55d29893ace0 "3.100000e+01"
        p = 0x55d29893ace0 "3.100000e+01"
#3  0x000055d296a40aa9 in FunctionCall1Coll (flinfo=0x55d298735478, 
    collation=0, arg1=94362989160448) at fmgr.c:1138
        fcinfodata = {fcinfo = {flinfo = 0x55d298735478, context = 0x0, 
            resultinfo = 0x0, fncollation = 0, isnull = false, nargs = 1, 
            args = 0x7ffe3ddff6e0}, 
          fcinfo_data = "xTs\230\322U", '\000' <repeats 23 times>, "U\001\000\000\350r\230\322U\000\000\000m\223\230\322U\000"}
        fcinfo = 0x7ffe3ddff6c0
        result = 94362958816336
        __func__ = "FunctionCall1Coll"
#4  0x000055d296a42004 in OutputFunctionCall (flinfo=0x55d298735478, 
    val=94362989160448) at fmgr.c:1575
No locals.
#5  0x000055d29634a8b4 in printtup (slot=0x55d2987344b8, self=0x55d298936cc0)
    at printtup.c:357
        outputstr = 0x55d296882235 <check_stack_depth+13> "\204\300td\276"
        thisState = 0x55d298735468
        attr = 94362989160448
        typeinfo = 0x55d2987343a0
        myState = 0x55d298936cc0
        oldcontext = 0x55d298733e60
        buf = 0x55d298936d10
        natts = 1
        i = 0


Error Log:
=========
2022-12-20 02:44:43.728 UTC [633388] LOG:  server process (PID 783919) was terminated by signal 11: Segmentation fault
2022-12-20 02:44:43.728 UTC [633388] DETAIL:  Failed process was running: SELECT seg_in(numeric_out(round(31,1000000)));
2022-12-20 02:44:43.728 UTC [633388] LOG:  terminating any other active server processes

Thanks to SQLSmith / SQLReduce for helping with the find.

-
Robins Tharakan
Amazon Web Services


Re: BUG #17725: Segfault when seg_in() called with a large argument

From
John Naylor
Date:
On Tue, Dec 20, 2022 at 4:28 PM PG Bug reporting form <noreply@postgresql.org> wrote:

> PostgreSQL version: 15.1

> The following SQL Segfaults on master (tested on b3bb7d12af).

> Backtrace on ea5ae4cae6@REL_14_STABLE:

> SQL: SELECT seg_in(numeric_out(round(31, 10000)))

> 2022-12-20 02:44:43.728 UTC [633388] DETAIL:  Failed process was running:
> SELECT seg_in(numeric_out(round(31,1000000)));

Neither query shows the reported problem in my environment on master (as of today) or v14, so not sure 

=# SELECT seg_in(numeric_out(round(31, 10000)));
 seg_in
--------
 3e1
(1 row)

=# SELECT seg_in(numeric_out(round(31,1000000)));
 seg_in
--------
 3e1
(1 row)

It's possibly relevant that this result is different from the "3.100000e+01" which was shown in your backtrace. Since a few details of this report don't agree with each other, I'm starting to wonder if some other relevant details got lost along the way.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: BUG #17725: Segfault when seg_in() called with a large argument

From
Robins Tharakan
Date:
Hi John,

On Tue, 20 Dec 2022 at 20:44, John Naylor <john.naylor@enterprisedb.com> wrote:
> Neither query shows the reported problem in my environment on master (as of today) or v14, so not sure
> It's possibly relevant that this result is different from the "3.100000e+01" which was shown in your backtrace. Since a few details of this report don't agree with each other, I'm starting to wonder if some other relevant details got lost along the way.
 

Thanks for taking a look and you're possibly correct.

After trying a few combinations, I see that passing
CFLAGS="-Wuninitialized" (default for my test setup) causes this failure.
Removing the flag gives the error you mention, and possibly why this
may not be easy to reproduce on a production system (unsure).

$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

# How I trigger compilation
cd ${sourcepth} && git clean -xdf && \
    ./configure CFLAGS="-Wuninitialized" --prefix=${installpth} && \
    make -j`nproc` install ...

This is a recent crash on 69f75bf825@REL_12_STABLE

2022-12-20 10:24:53.361 UTC [3087004] LOG:  server process (PID 3182365) was terminated by signal 11: Segmentation fault
2022-12-20 10:24:53.361 UTC [3087004] DETAIL:  Failed process was running: SELECT seg_in(numeric_out(round(31, 10000)));
2022-12-20 10:24:53.361 UTC [3087004] LOG:  terminating any other active server processes
2022-12-20 10:24:53.366 UTC [3087004] LOG:  all server processes terminated; reinitializing

I created this bug-report since I am able to reproduce this at will. But let
me know if this is uninteresting, or if I can provide any other detail to
help in triaging.

-
robins



Re: BUG #17725: Segfault when seg_in() called with a large argument

From
Tom Lane
Date:
Robins Tharakan <tharakan@gmail.com> writes:
> On Tue, 20 Dec 2022 at 20:44, John Naylor <john.naylor@enterprisedb.com> wrote:
>> Neither query shows the reported problem in my environment on master (as of today) or v14, so not sure

> After trying a few combinations, I see that passing
> CFLAGS="-Wuninitialized" (default for my test setup) causes this failure.
> Removing the flag gives the error you mention, and possibly why this
> may not be easy to reproduce on a production system (unsure).

I don't see a crash either, but I can't help observing that this
input leads to a "seg" struct with "-46" significant digits:

(gdb) p *seg
$3 = {lower = 31, upper = 31, l_sigd = -46 '\322', u_sigd = -46 '\322',
  l_ext = 0 '\000', u_ext = 0 '\000'}
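
As a hedged illustration of how a digit count could come out as -46: if a
count larger than 127 ends up in a signed 8-bit field, only its low 8 bits
survive. The count 2002 below is an assumption, picked because it wraps to
-46; the thread does not state the real count, and wrap_to_int8 is a
hypothetical helper, not code from seg.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: compute what a non-negative count looks like after
 * being squeezed into a signed 8-bit two's-complement field.  Done with
 * explicit arithmetic so the result does not depend on the compiler's
 * implementation-defined narrowing conversion.
 */
static int wrap_to_int8(int count)
{
    int low;

    assert(count >= 0);
    low = count & 0xFF;                     /* keep the low 8 bits */
    return (low >= 128) ? low - 256 : low;  /* reinterpret as signed */
}
```

Under that assumption, wrap_to_int8(2002) yields -46 (2002 = 7 * 256 + 210,
and 210 - 256 = -46), while small counts such as 7 pass through unchanged.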

So we're invoking sprintf with a fairly insane precision spec:

939             sprintf(result, "%.*e", n - 1, val);
(gdb) p n
$4 = -46
(gdb) p val
$5 = 31

POSIX says "a negative precision is taken as if the precision were
omitted", and our code seems to do that, but I wonder if this is
managing to overrun the output buffer on your platform.
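
The POSIX rule quoted above can be checked with a small standalone sketch.
The helper names format_e and matches_default are made up for illustration,
and snprintf is used instead of sprintf so the write is bounded; this is not
the seg.c code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format val with "%.*e" and precision n - 1, as in the quoted call,
 * but bounded by buflen.  Returns snprintf's result. */
static int format_e(char *buf, size_t buflen, double val, int n)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%.*e", n - 1, val);
}

/* Returns 1 if the output for this n is identical to a plain "%e". */
static int matches_default(double val, int n)
{
    char with_prec[64];
    char plain[64];

    format_e(with_prec, sizeof with_prec, val, n);
    snprintf(plain, sizeof plain, "%e", val);
    return strcmp(with_prec, plain) == 0;
}
```

With val = 31 and n = -46 the precision argument is -47, which is treated as
omitted, so format_e produces "3.100000e+01", the same string that appears
as "result" in the reported backtrace.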

IMO:

1. The seg grammar needs to constrain the result of significant_digits()
to something that will fit in the allocated "char" field width.
It looks like some code paths there have clamps, but not all.

2. Because we might already have stored "seg" values with bogus
sigd values, restore() had better clamp the "n" value it's given
to something sane.  I see it clamps large positive values, but
it's not worrying about zero-or-negative.
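
A minimal sketch of the kind of clamp point 2 describes, assuming FLT_DIG is
an acceptable upper bound and fallback (the thread does not say what limit
restore() uses). clamp_sigd is a hypothetical name, not the actual seg.c
code or the eventual fix.

```c
#include <assert.h>
#include <float.h>

/*
 * Hypothetical helper: force a significant-digit count into 1..FLT_DIG.
 * Zero or negative counts, which may exist in already-stored values,
 * fall back to FLT_DIG, as do counts that are too large.
 */
static int clamp_sigd(int n)
{
    int result = (n <= 0 || n > FLT_DIG) ? FLT_DIG : n;

    assert(result >= 1 && result <= FLT_DIG);
    return result;
}
```

Applying such a clamp on output protects against bad values already on
disk, while a matching limit in the grammar would keep new bad values from
being stored in the first place.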

            regards, tom lane



Re: BUG #17725: Segfault when seg_in() called with a large argument

From
Tom Lane
Date:
I wrote:
> I don't see a crash either, but I can't help observing that this
> input leads to a "seg" struct with "-46" significant digits:
> ...
> So we're invoking sprintf with a fairly insane precision spec:

Actually, it looks like sprintf is not the problem.  This is:

(gdb) 
984                                             buf[10 + n] = '\0';
(gdb) p n
$9 = -46

So first off, we're stomping on something we shouldn't, and
secondly we're failing to nul-terminate buf[], which easily
explains your observed crash at the strcpy a little further
down.  On most platforms strcpy would find a nul byte not
too much further on, which might prevent the worst sorts
of damage, but this is still very ugly.
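
To make the index arithmetic concrete: with n = -46, buf[10 + n] is buf[-36],
which lies before the start of the array. The sketch below only checks the
index and refuses the write when it is out of range; the helper names and the
caller-supplied buffer length are assumptions for illustration, not the
seg.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if index 10 + n falls inside a buffer of buflen bytes. */
static int terminator_in_bounds(int n, size_t buflen)
{
    long idx = 10L + n;

    return idx >= 0 && (size_t) idx < buflen;
}

/* Writes the terminator only when the index is valid; returns 1 on success. */
static int terminate_at(char *buf, size_t buflen, int n)
{
    assert(buf != NULL);
    if (!terminator_in_bounds(n, buflen))
        return 0;               /* would write outside buf: do nothing */
    buf[10 + n] = '\0';
    return 1;
}
```

For a 32-byte buffer (a made-up size), terminate_at(buf, 32, -46) returns 0
and leaves the buffer alone, whereas n = 3 terminates the string at buf[13].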

            regards, tom lane