[PATCH] Fix crash in int8_avg_combine(). - Mailing list pgsql-hackers

From Hadi Moshayedi
Subject [PATCH] Fix crash in int8_avg_combine().
Date
Msg-id CAK=1=WpZLa3Z9txmqzX8ujzXnUHE7_-jWxJBr7d7be+-9Afw6w@mail.gmail.com
Whole thread Raw
Responses Re: [PATCH] Fix crash in int8_avg_combine().  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [PATCH] Fix crash in int8_avg_combine().  (Andres Freund <andres@citusdata.com>)
List pgsql-hackers
While doing some tests on REL_10_STABLE, I was getting run-time exceptions at int8_avg_combine() at the following line:

state1->sumX = state2->sumX;

After some debugging, I noticed that palloc()’s alignment is 8-bytes, while this statement (which moves a __int128 from one memory location to another memory location) expects 16-byte memory alignments. So when either state1 or state2 is not 16-byte aligned, this crashes.

When I disassemble the code, the above statement is translated to a pair of movdqa and movaps assignments when compiled with -O2:

movdqa  c(%rdx), %xmm0
movaps  %xmm0, c(%rcx)

Looking at “Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 2”, both of these instructions expect 16-byte aligned memory locations, or a general-protection exception will be generated.

I wasn’t getting the exception always. For example, the same query that crashed in REL_10_STABLE didn’t crash in master. But I think that was just lucky allocation, and we will eventually see cases of this crash in all versions that use __int128 assignment in int8_avg_combine().

I’ve attached a patch which sets the MAXIMUM_ALIGNOF correctly when __int128’s are available, so fixes the crash. This patch is on top of REL_10_STABLE, which is the branch I was getting the crash at. I can create patches for other branches if we think this is the proper change.

The sequence for which I got the crash in REL_10_STABLE was the following sequence of commands after a fresh initdb:

CREATE TABLE t(a BIGINT);
INSERT INTO t SELECT * FROM generate_series(1, 10000000);
SELECT sum(a) FROM t;

I’m not sure if this will crash for everyone, since you can be lucky and have both states assigned 16-byte aligned locations. This was the case for me in master branch. "SELECT version()" is "PostgreSQL 10.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), 64-bit", if that is related.

I notice that there’s a comment in configure.in that says the penalty of using 16-bit alignment might be too much for disk and memory space. If this is the case, then we need to modify numeric.c to make sure that it never use any instructions with 16-byte alignment requirements. I can do that if that is the consensus.

Or maybe we can create an allocation function for 16-byte aligned allocations.

Any thoughts?

-- Hadi


Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Next
From: Andres Freund
Date:
Subject: Re: [PATCH] Fix crash in int8_avg_combine().