Re: SerializeParamList vs machines with strict alignment - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: SerializeParamList vs machines with strict alignment |
Date | |
Msg-id | CAA4eK1JfEc=cqiUiRvUcHYCf=PVEwM_bZ_QiOMHpqGpUMdY8gA@mail.gmail.com |
In response to | SerializeParamList vs machines with strict alignment (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: SerializeParamList vs machines with strict alignment (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wondered why buildfarm member chipmunk has been failing hard
> for the last little while.  Fortunately, it's supplying us with
> a handy backtrace:
>
> Program terminated with signal 7, Bus error.
> #0  EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
> 329             aresult->dataoffset = dataoffset;
> #0  EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
> #1  EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293
> #2  0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) at expandeddatum.c:84
> #3  0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>, start_address=0xbea3bd54) at datum.c:341
> #4  0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195
> #5  0x002342cc in ExecInitParallelPlan (planstate=0xffffffff, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1) at execParallel.c:700
> #6  0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151
> #7  0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445
> #8  0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237
> #9  ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0) at execMain.c:1721
> #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
> #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867
> #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164
>
> This is remarkably hard to replicate on other machines, but I eventually
> managed to duplicate it on gaur's host, after which it became really
> obvious that the parallel-query data transfer logic has never been
> stressed very hard on machines with strict data alignment rules.
>
> In particular, SerializeParamList does this:
>
>         /* Write flags. */
>         memcpy(*start_address, &prm->pflags, sizeof(uint16));
>         *start_address += sizeof(uint16);
>
> immediately followed by this:
>
>         datumSerialize(prm->value, prm->isnull, typByVal, typLen,
>                        start_address);
>
> and datumSerialize might do this:
>
>             EOH_flatten_into(eoh, (void *) *start_address, header);
>
> Now, I will plead mea culpa that the expanded-object API doesn't
> say in large red letters that the target address for EOH_flatten_into
> is supposed to be maxaligned.  It only says
>
>  * The flattened representation must be a valid in-line, non-compressed,
>  * 4-byte-header varlena object.
>
> Still, one might reasonably suspect from that that *at least* 4-byte
> alignment is expected.
>

datumSerialize does this:

    memcpy(*start_address, &header, sizeof(int));
    *start_address += sizeof(int);

before calling EOH_flatten_into, so it seems to me it should be 4-byte
aligned.

> This code path isn't providing such alignment,
> and machines that require it will crash.

Yeah, I think as suggested by you, start_address should be maxaligned.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
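
To make the offset arithmetic in the exchange above concrete, here is a minimal sketch in C. It models only the two writes quoted in this thread (the uint16 flags, then the int header) and tracks offsets rather than real pointers. Every `sketch_` name is hypothetical, the value 8 for the maximum alignment is an assumption, and the padding step is just one possible way of getting a maxaligned target, not necessarily the fix that was adopted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a platform's maximum alignment; 8 is assumed. */
#define SKETCH_MAX_ALIGN ((size_t) 8)

/* Round off up to the next multiple of alignment (a power of two). */
static size_t
sketch_align_up(size_t off, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (off + alignment - 1) & ~(alignment - 1);
}

/*
 * Offset at which the flattened object would begin, relative to a buffer
 * position `start`, after the two writes quoted in the thread.  With
 * pad != 0 the offset is additionally rounded up to SKETCH_MAX_ALIGN.
 */
static size_t
sketch_payload_offset(size_t start, int pad)
{
    size_t off = start;

    off += sizeof(uint16_t);    /* the flags written by SerializeParamList */
    off += sizeof(int);         /* the header written by datumSerialize */
    if (pad)
        off = sketch_align_up(off, SKETCH_MAX_ALIGN);
    return off;
}

/* Nonzero if off is a multiple of alignment. */
static int
sketch_is_aligned(size_t off, size_t alignment)
{
    return off % alignment == 0;
}
```

With a 4-byte int and an aligned starting offset, the unpadded payload offset is a multiple of 2 but not of 4: advancing by sizeof(int) keeps whatever misalignment the preceding uint16 write introduced. That is consistent with the result=0xb55ff30e address in the backtrace. If padding were used, whatever reads the buffer back would have to skip the same padding.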