On Wed, Jun 26, 2019 at 09:49:46AM +0200, Tomas Vondra wrote:
>On Tue, Jun 25, 2019 at 11:52:28PM -0400, Tom Lane wrote:
>>I'm seeing a reproducible bus error here:
>>
>>#0 0x00417420 in statext_mcv_serialize (mcvlist=0x62223450, stats=Variable "stats" is not available.
>>)
>> at mcv.c:785
>>785 memcpy(ITEM_BASE_FREQUENCY(item, ndims), &mcvitem->base_frequency, sizeof(double));
>>
>>What appears to be happening is that since ITEM_BASE_FREQUENCY is defined as
>>
>>#define ITEM_BASE_FREQUENCY(item,ndims) ((double *) (ITEM_FREQUENCY(item, ndims) + 1))
>>
>>the compiler is assuming that the first argument to memcpy is
>>double-aligned, and it is generating code that depends on that being
>>true, and of course it isn't true and kaboom.
>>
>>You can *not* cast something to an aligned pointer type if it's not
>>actually certain to be aligned suitably for that type. In this example,
>>even if you wrote "(char *)" in front of this, it wouldn't save you;
>>the compiler would still be entitled to believe that the intermediate
>>cast value meant something. The casts in the underlying macros
>>ITEM_FREQUENCY and so on are equally unsafe.
>>
>
>OK. So the solution is to ditch the casts altogether, and then do plain
>pointer arithmetics like this:
>
>#define ITEM_INDEXES(item) (item)
>#define ITEM_NULLS(item,ndims) (ITEM_INDEXES(item) + (ndims))
>#define ITEM_FREQUENCY(item,ndims) (ITEM_NULLS(item, ndims) + (ndims))
>#define ITEM_BASE_FREQUENCY(item,ndims) (ITEM_FREQUENCY(item, ndims) + sizeof(double))
>
>Or is that still relying on alignment, somehow?
>
Attached is a patch that should (hopefully) fix this. It essentially
treats the item as (char *) and does all pointer arithmetics without any
additional casts. So there are no intermediate casts.
I have no way to test this, so I may either wait for you to test this
first, or push and wait. It seems to fail only on a very small number of
buildfarm animals, so having a confirmation would be nice.
The fix keeps the binary format as is, so the serialized MCV items are
max-aligned. That means we can access the uint16 indexes directly, but we
need to copy the rest of the fields (because those may not be aligned). In
hindsight that seems a bit silly, we might as well copy everything, not
care about the alignment and maybe save a few more bytes. But that would
require catversion bump. OTOH we may beed to do that anyway, to fix the
pg_mcv_list_items() signature (as discussed in the other MCV thread).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services