Re: [BUGS] BUG #14897: Segfault on statitics SQL request - Mailing list pgsql-bugs

From Tom Lane
Subject Re: [BUGS] BUG #14897: Segfault on statitics SQL request
Date
Msg-id 10396.1510621709@sss.pgh.pa.us
Whole thread Raw
In response to Re: [BUGS] BUG #14897: Segfault on statitics SQL request  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses Re: [BUGS] BUG #14897: Segfault on statitics SQL request  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Dmitry Dolgov <9erthalion6@gmail.com> writes:
> I've managed to reproduce this on Ubuntu. I tried multiple gcc versions,
> and for gcc-5.4 everything is fine, for gcc-6.4 there is an issue.

I found out I can reproduce this on Fedora 26 as well; I'd been
fat-fingering the test somehow.

The place where it is actually crashing for me is in int8_avg_combine(),
and *the struct that is misaligned is one that has been passed back from
a worker process*.  This means that the problem is even worse than we
thought before, because there is no reasonable way that the code that
palloc'd that struct within the current process would know that it needs
more than MAXALIGN alignment.  It is not hard to think of other cases
where an int128-containing struct might get passed through code that
doesn't know what is in it.

So what I am now thinking is that the only practical answer is to stop
gcc from believing that it is safe to use 16-aligned instructions on
int128's.  (Some reading on the net suggests that the actual performance
penalty for that is minimal anyway on modern Intel chips.)

It looks like we can do that with very little code impact by attaching
alignment pragmas to the typedefs for [u]int128:

typedef PG_INT128_TYPE int128
#if defined(pg_attribute_aligned)
pg_attribute_aligned(8)
#endif
;
typedef unsigned PG_INT128_TYPE uint128
#if defined(pg_attribute_aligned)
pg_attribute_aligned(8)
#endif
;

When I do that, the coredump goes away for me.  Moreover, this seems
like an easily back-patchable fix, and we do need to do something
back to 9.5.

Now, one small problem is that c.h defines pg_attribute_aligned further
down than where these typedefs currently appear.  This seems to me to
indicate that we ought to rethink the contents, or at least the ordering,
of the "sections" in c.h.  I wonder if anyone has any strong opinions
about exactly how to do that.
        regards, tom lane


pgsql-bugs by date:

Previous
From: Neil Anderson
Date:
Subject: Re: BUG #14900: MView not null constraint
Next
From: xkevin.hk@gmail.com
Date:
Subject: BUG #14905: can't restart successfully because of existed pid file ofhttpd