Thread: SerializeParamList vs machines with strict alignment
I wondered why buildfarm member chipmunk has been failing hard for the last little while. Fortunately, it's supplying us with a handy backtrace: Program terminated with signal 7, Bus error. #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329 329 aresult->dataoffset = dataoffset; #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329 #1 EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293 #2 0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) at expandeddatum.c:84 #3 0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>,start_address=0xbea3bd54) at datum.c:341 #4 0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195 #5 0x002342cc in ExecInitParallelPlan (planstate=0xffffffff, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1)at execParallel.c:700 #6 0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151 #7 0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445 #8 0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237 #9 ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimizedout>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0)at execMain.c:1721 #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362 #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867 #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164 This is remarkably hard to replicate on other machines, but I eventually managed to duplicate it on gaur's host, after which it became really obvious that the parallel-query data transfer logic has never been stressed very hard on machines with strict data alignment rules. In particular, SerializeParamList does this: /* Write flags. */ memcpy(*start_address, &prm->pflags, sizeof(uint16)); *start_address += sizeof(uint16); immediately followed by this: datumSerialize(prm->value, prm->isnull, typByVal, typLen, start_address); and datumSerialize might do this: EOH_flatten_into(eoh, (void *) *start_address, header); Now, I will plead mea culpa that the expanded-object API doesn't say in large red letters that the target address for EOH_flatten_into is supposed to be maxaligned. It only says * The flattened representation must be a valid in-line, non-compressed, * 4-byte-header varlena object. Still, one might reasonably suspect from that that *at least* 4-byte alignment is expected. This code path isn't providing such alignment, and machines that require it will crash. The only reason we've not noticed, AFAICS, is that nobody has been running with force_parallel_mode = regress on alignment-picky hardware. regards, tom lane
On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wondered why buildfarm member chipmunk has been failing hard > for the last little while. Fortunately, it's supplying us with > a handy backtrace: > > Program terminated with signal 7, Bus error. > #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329 > 329 aresult->dataoffset = dataoffset; > #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329 > #1 EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293 > #2 0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) atexpandeddatum.c:84 > #3 0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>,start_address=0xbea3bd54) at datum.c:341 > #4 0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195 > #5 0x002342cc in ExecInitParallelPlan (planstate=0xffffffff, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1)at execParallel.c:700 > #6 0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151 > #7 0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445 > #8 0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237 > #9 ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimizedout>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0)at execMain.c:1721 > #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362 > #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867 > #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164 > > This is remarkably hard to replicate on other machines, but I eventually > managed to duplicate it on gaur's host, after which it became really > obvious that the parallel-query data transfer logic has never been > stressed very hard on machines with strict data alignment rules. > > In particular, SerializeParamList does this: > > /* Write flags. */ > memcpy(*start_address, &prm->pflags, sizeof(uint16)); > *start_address += sizeof(uint16); > > immediately followed by this: > > datumSerialize(prm->value, prm->isnull, typByVal, typLen, > start_address); > > and datumSerialize might do this: > > EOH_flatten_into(eoh, (void *) *start_address, header); > > Now, I will plead mea culpa that the expanded-object API doesn't > say in large red letters that the target address for EOH_flatten_into > is supposed to be maxaligned. It only says > > * The flattened representation must be a valid in-line, non-compressed, > * 4-byte-header varlena object. > > Still, one might reasonably suspect from that that *at least* 4-byte > alignment is expected. > datumSerialize does this: memcpy(*start_address, &header, sizeof(int)); *start_address += sizeof(int); before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned. > This code path isn't providing such alignment, > and machines that require it will crash. Yeah, I think as suggested by you, start_address should be maxaligned. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In particular, SerializeParamList does this: >> >> /* Write flags. */ >> memcpy(*start_address, &prm->pflags, sizeof(uint16)); >> *start_address += sizeof(uint16); >> >> immediately followed by this: >> >> datumSerialize(prm->value, prm->isnull, typByVal, typLen, >> start_address); > datumSerialize does this: > memcpy(*start_address, &header, sizeof(int)); > *start_address += sizeof(int); > before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned. But that doesn't undo the fact that you're now on an odd halfword boundary. In the case I observed, EA_flatten_into was trying to store an int32 through a pointer whose hex value ended in E, which is explained by the "+= sizeof(uint16)". > Yeah, I think as suggested by you, start_address should be maxaligned. A localized fix would be for datumSerialize to temporarily palloc the space for EOH_flatten_into to write into, and then it could memcpy that to whatever random address it'd been passed. Seems kind of inefficient though, especially since you'd likely have to do the same thing on the receiving side. A larger issue is that even on machines where misalignment is permitted, memcpy'ing to/from misaligned addresses is rather inefficient. I think it'd be better to redesign the whole thing so that every field is maxaligned, and you can e.g. just store and retrieve integer fields as integers without a memcpy. The "wasted" padding space doesn't seem very significant given the short-lived nature of the serialized data. However, that's not exactly a trivial change. Probably the best way forward is to adopt the extra-palloc solution as a back-patchable fix, and then work on a more performant solution in HEAD only. regards, tom lane
On Mon, Sep 10, 2018 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > > On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> In particular, SerializeParamList does this: > >> > >> /* Write flags. */ > >> memcpy(*start_address, &prm->pflags, sizeof(uint16)); > >> *start_address += sizeof(uint16); > >> > >> immediately followed by this: > >> > >> datumSerialize(prm->value, prm->isnull, typByVal, typLen, > >> start_address); > > > datumSerialize does this: > > > memcpy(*start_address, &header, sizeof(int)); > > *start_address += sizeof(int); > > > before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned. > > But that doesn't undo the fact that you're now on an odd halfword > boundary. In the case I observed, EA_flatten_into was trying to > store an int32 through a pointer whose hex value ended in E, which > is explained by the "+= sizeof(uint16)". > > > Yeah, I think as suggested by you, start_address should be maxaligned. > > A localized fix would be for datumSerialize to temporarily palloc the > space for EOH_flatten_into to write into, and then it could memcpy > that to whatever random address it'd been passed. Attached is a patch along these lines, let me know if you have something else in mind? I have manually verified this with force_parallel_mode=regress configuration in my development environment. I don't have access to alignment-sensitive hardware, so can't test in such an environment. I will see if I can write a test as well. > Seems kind of > inefficient though, especially since you'd likely have to do the same > thing on the receiving side. > I am not sure what exactly you have in mind for receiving side. datumRestore does below for pass-by-reference values: /* Pass-by-reference case; copy indicated number of bytes. */ Assert(header > 0); d = palloc(header); memcpy(d, *start_address, header); *start_address += header; return PointerGetDatum(d); Do we need any other special handling here? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On Sun, Sep 30, 2018 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Sep 10, 2018 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Amit Kapila <amit.kapila16@gmail.com> writes: > > > On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > >> In particular, SerializeParamList does this: > > >> > > >> /* Write flags. */ > > >> memcpy(*start_address, &prm->pflags, sizeof(uint16)); > > >> *start_address += sizeof(uint16); > > >> > > >> immediately followed by this: > > >> > > >> datumSerialize(prm->value, prm->isnull, typByVal, typLen, > > >> start_address); > > > > > datumSerialize does this: > > > > > memcpy(*start_address, &header, sizeof(int)); > > > *start_address += sizeof(int); > > > > > before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned. > > > > But that doesn't undo the fact that you're now on an odd halfword > > boundary. In the case I observed, EA_flatten_into was trying to > > store an int32 through a pointer whose hex value ended in E, which > > is explained by the "+= sizeof(uint16)". > > > > > Yeah, I think as suggested by you, start_address should be maxaligned. > > > > A localized fix would be for datumSerialize to temporarily palloc the > > space for EOH_flatten_into to write into, and then it could memcpy > > that to whatever random address it'd been passed. > > Attached is a patch along these lines, let me know if you have > something else in mind? I have manually verified this with > force_parallel_mode=regress configuration in my development > environment. I don't have access to alignment-sensitive hardware, so > can't test in such an environment. I will see if I can write a test > as well. > Attached patch contains a test case as well to cover this code. It will help us in identifying if there is any failure on alignment-sensitive hardware in this code path. Kindly suggest what is the best path forward, is it a good idea to commit this on HEAD first and see if the test passes on all machines in buildfarm and then back-patch it? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
Amit Kapila <amit.kapila16@gmail.com> writes: > On Sun, Sep 30, 2018 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >> Attached is a patch along these lines, let me know if you have >> something else in mind? I have manually verified this with >> force_parallel_mode=regress configuration in my development >> environment. I don't have access to alignment-sensitive hardware, so >> can't test in such an environment. I will see if I can write a test >> as well. > Attached patch contains a test case as well to cover this code. This test case doesn't actually trigger the problem. There are two reasons: (1) it never invokes datumSerialize with an odd starting address, and (2) the expanded array produced by the plpgsql function contains a "flat" array, allowing EA_flatten_into to go through its "easy" case, which just does a memcpy and so is not sensitive to misalignment. To fix (1) we need a previous parameter of odd length, and to fix (2) we need the plpgsql function to build the array from parts. The cast to a domain is unnecessary and might indeed be a third reason why it doesn't work --- I'd not be very surprised if that results in array flattening. The attached revised patch contains a test case that demonstrably triggers the problem on gaur's host. Oddly, I do not get a crash either on a PPC Mac or a Raspberry Pi 3 running Raspbian. I'm not very sure why; I traced through things with gdb and it's definitely calling EA_flatten_into with an odd address and a non-flattened input. I guess both of those platforms have kernel handlers for misaligned accesses? But the Raspbian box ought to be nearly the same as chipmunk, which is where we saw the problem to begin with, so I'm a bit confused. > Kindly suggest what > is the best path forward, is it a good idea to commit this on HEAD > first and see if the test passes on all machines in buildfarm and then > back-patch it? No, I think you should just commit and back-patch at the same time. I don't have any doubt about the code patch being good, it's just a question of whether the test case reliably triggers the problem. And in the end that's not that important as long as it does so on at least one buildfarm member. regards, tom lane diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index f02a5e7..4957682 100644 *** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** datumSerialize(Datum value, bool isnull, *** 338,345 **** } else if (eoh) { ! EOH_flatten_into(eoh, (void *) *start_address, header); *start_address += header; } else { --- 338,356 ---- } else if (eoh) { ! char *tmp; ! ! /* ! * EOH_flatten_into expects the target address to be maxaligned, ! * so we can't store directly to *start_address. ! */ ! tmp = (char *) palloc(header); ! EOH_flatten_into(eoh, (void *) tmp, header); ! memcpy(*start_address, tmp, header); *start_address += header; + + /* be tidy. */ + pfree(tmp); } else { diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out index 26409d3..d1f6510 100644 *** a/src/test/regress/expected/select_parallel.out --- b/src/test/regress/expected/select_parallel.out *************** ORDER BY 1, 2, 3; *** 1088,1094 **** ------------------------------+---------------------------+-------------+-------------- (0 rows) ! -- test interation between subquery and partial_paths SET LOCAL min_parallel_table_scan_size TO 0; CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1; EXPLAIN (COSTS OFF) --- 1088,1124 ---- ------------------------------+---------------------------+-------------+-------------- (0 rows) ! -- test passing expanded-value representations to workers ! SAVEPOINT settings; ! SET force_parallel_mode = 1; ! CREATE FUNCTION make_some_array(int,int) returns int[] as ! $$declare x int[]; ! begin ! x[1] := $1; ! x[2] := $2; ! return x; ! end$$ language plpgsql parallel safe; ! CREATE TABLE foo(f1 text, f2 int[], f3 text); ! INSERT INTO foo VALUES('1', ARRAY[1,2], 'one'); ! PREPARE pstmt(text, int[]) AS SELECT * FROM foo WHERE f1 = $1 AND f2 = $2; ! EXPLAIN (COSTS OFF) EXECUTE pstmt('1', make_some_array(1,2)); ! QUERY PLAN ! ------------------------------------------------------------------ ! Gather ! Workers Planned: 3 ! -> Parallel Seq Scan on foo ! Filter: ((f1 = '1'::text) AND (f2 = '{1,2}'::integer[])) ! (4 rows) ! ! EXECUTE pstmt('1', make_some_array(1,2)); ! f1 | f2 | f3 ! ----+-------+----- ! 1 | {1,2} | one ! (1 row) ! ! DEALLOCATE pstmt; ! ROLLBACK TO SAVEPOINT settings; ! -- test interaction between subquery and partial_paths SET LOCAL min_parallel_table_scan_size TO 0; CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1; EXPLAIN (COSTS OFF) diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql index 938c708..326c8f6 100644 *** a/src/test/regress/sql/select_parallel.sql --- b/src/test/regress/sql/select_parallel.sql *************** ORDER BY 1; *** 410,416 **** SELECT * FROM information_schema.foreign_data_wrapper_options ORDER BY 1, 2, 3; ! -- test interation between subquery and partial_paths SET LOCAL min_parallel_table_scan_size TO 0; CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1; EXPLAIN (COSTS OFF) --- 410,436 ---- SELECT * FROM information_schema.foreign_data_wrapper_options ORDER BY 1, 2, 3; ! -- test passing expanded-value representations to workers ! SAVEPOINT settings; ! SET force_parallel_mode = 1; ! ! CREATE FUNCTION make_some_array(int,int) returns int[] as ! $$declare x int[]; ! begin ! x[1] := $1; ! x[2] := $2; ! return x; ! end$$ language plpgsql parallel safe; ! CREATE TABLE foo(f1 text, f2 int[], f3 text); ! INSERT INTO foo VALUES('1', ARRAY[1,2], 'one'); ! ! PREPARE pstmt(text, int[]) AS SELECT * FROM foo WHERE f1 = $1 AND f2 = $2; ! EXPLAIN (COSTS OFF) EXECUTE pstmt('1', make_some_array(1,2)); ! EXECUTE pstmt('1', make_some_array(1,2)); ! DEALLOCATE pstmt; ! ROLLBACK TO SAVEPOINT settings; ! ! -- test interaction between subquery and partial_paths SET LOCAL min_parallel_table_scan_size TO 0; CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1; EXPLAIN (COSTS OFF)
I wrote: > The attached revised patch contains a test case that demonstrably triggers > the problem on gaur's host. Oddly, I do not get a crash either on a PPC > Mac or a Raspberry Pi 3 running Raspbian. I'm not very sure why; I traced > through things with gdb and it's definitely calling EA_flatten_into with > an odd address and a non-flattened input. I guess both of those platforms > have kernel handlers for misaligned accesses? But the Raspbian box ought > to be nearly the same as chipmunk, which is where we saw the problem to > begin with, so I'm a bit confused. Ah: a bit of googling later, the mystery is solved. PPC does have support for unaligned 32-bit accesses, which is as much as EA_flatten_into needs. (It's 64-bit operations where you might have a problem.) My info was also out of date about ARM: more recent processors, at least, can also do unaligned 32-bit accesses. chipmunk either has a pretty old processor or it is configured to disable unaligned access. Apparently the only somewhat-modern architecture that is resolutely unaligned-unfriendly is MIPS. regards, tom lane
Wow, Tom. This is great stuff. Thanks for it. Lou Picciano > On Sep 9, 2018, at 11:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wondered why buildfarm member chipmunk has been failing hard > for the last little while. Fortunately, it's supplying us with > a handy backtrace: > > Program terminated with signal 7, Bus error. > #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329 > 329 aresult->dataoffset = dataoffset; > #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329 > #1 EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293 > #2 0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) atexpandeddatum.c:84 > #3 0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>,start_address=0xbea3bd54) at datum.c:341 > #4 0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195 > #5 0x002342cc in ExecInitParallelPlan (planstate=0xffffffff, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1)at execParallel.c:700 > #6 0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151 > #7 0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445 > #8 0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237 > #9 ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimizedout>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0)at execMain.c:1721 > #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362 > #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867 > #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164 > > This is remarkably hard to replicate on other machines, but I eventually > managed to duplicate it on gaur's host, after which it became really > obvious that the parallel-query data transfer logic has never been > stressed very hard on machines with strict data alignment rules. > > In particular, SerializeParamList does this: > > /* Write flags. */ > memcpy(*start_address, &prm->pflags, sizeof(uint16)); > *start_address += sizeof(uint16); > > immediately followed by this: > > datumSerialize(prm->value, prm->isnull, typByVal, typLen, > start_address); > > and datumSerialize might do this: > > EOH_flatten_into(eoh, (void *) *start_address, header); > > Now, I will plead mea culpa that the expanded-object API doesn't > say in large red letters that the target address for EOH_flatten_into > is supposed to be maxaligned. It only says > > * The flattened representation must be a valid in-line, non-compressed, > * 4-byte-header varlena object. > > Still, one might reasonably suspect from that that *at least* 4-byte > alignment is expected. This code path isn't providing such alignment, > and machines that require it will crash. The only reason we've not > noticed, AFAICS, is that nobody has been running with > force_parallel_mode = regress on alignment-picky hardware. > > regards, tom lane >
On Tue, Oct 2, 2018 at 9:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Apparently the only somewhat-modern architecture that is resolutely > unaligned-unfriendly is MIPS. It's been a few years now since I worked on that architecture, but Sparc is somewhat-modern and resolutely unaligned-unfriendly. It's just that you can optionally install a trap handler that will do super slow non-atomic misaligned access in software instead of blowing up with SIGBUS. With the Sun toolchain you did that explicitly by building with -misalign (though it's possible that more recent compilers might be doing that without being asked?). -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes: > On Tue, Oct 2, 2018 at 9:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Apparently the only somewhat-modern architecture that is resolutely >> unaligned-unfriendly is MIPS. > It's been a few years now since I worked on that architecture, but > Sparc is somewhat-modern and resolutely unaligned-unfriendly. It's > just that you can optionally install a trap handler that will do super > slow non-atomic misaligned access in software instead of blowing up > with SIGBUS. With the Sun toolchain you did that explicitly by > building with -misalign (though it's possible that more recent > compilers might be doing that without being asked?). Interesting ... I suppose we'd have seen that on the Sparc critters, except that they weren't running with force_parallel_mode = regress like chipmunk is. Now I'm tempted to propose that Amit commit *just* the test case and not the fix, and wait a day to see which buildfarm critters fail. regards, tom lane
On Tue, Oct 2, 2018 at 5:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thomas Munro <thomas.munro@enterprisedb.com> writes: > > On Tue, Oct 2, 2018 at 9:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Apparently the only somewhat-modern architecture that is resolutely > >> unaligned-unfriendly is MIPS. > > > It's been a few years now since I worked on that architecture, but > > Sparc is somewhat-modern and resolutely unaligned-unfriendly. It's > > just that you can optionally install a trap handler that will do super > > slow non-atomic misaligned access in software instead of blowing up > > with SIGBUS. With the Sun toolchain you did that explicitly by > > building with -misalign (though it's possible that more recent > > compilers might be doing that without being asked?). > > Interesting ... I suppose we'd have seen that on the Sparc critters, > except that they weren't running with force_parallel_mode = regress > like chipmunk is. > > Now I'm tempted to propose that Amit commit *just* the test case > and not the fix, and wait a day to see which buildfarm critters fail. > Okay, I will take care of that. Are you proposing to commit the test only on HEAD or in back-branches as well? Ideally committing on HEAD would serve our purpose. There is one difference in our tests which I would like to highlight: fix_datum_serialization_v2: +EXPLAIN (COSTS OFF) EXECUTE stmt_sel_domain(make_psafe_ad(1,2)); + QUERY PLAN +-------------------------------------------------------- + Gather + Workers Planned: 1 + Single Copy: true + -> Seq Scan on tarrdomain + Filter: ((c2)::integer[] = '{1,2}'::integer[]) +(5 rows) fix_datum_serialization_v3 ! EXPLAIN (COSTS OFF) EXECUTE pstmt('1', make_some_array(1,2)); ! QUERY PLAN ! ------------------------------------------------------------------ ! Gather ! Workers Planned: 3 ! -> Parallel Seq Scan on foo ! Filter: ((f1 = '1'::text) AND (f2 = '{1,2}'::integer[])) ! (4 rows) I think if we do Analyze on the table after populating rows, it should use just one worker and that should be sufficient to hit the case being discussed. I would like to change the test so that it uses just one worker. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > I think if we do Analyze on the table after populating rows, it should > use just one worker and that should be sufficient to hit the case > being discussed. I would like to change the test so that it uses just > one worker. I thought that adding an ANALYZE would make the test be net slower, not faster; ANALYZE isn't free, even on just a row or so. Also, I believe that coding the test this way makes the leader send the param values to multiple workers, which would flush out any problems with serializing a value multiple times. As against that, there's a hazard that the number of workers might not be stable ... but it seems like we have lots of other occurrences of that same hazard elsewhere in this test script. regards, tom lane
On Tue, Oct 2, 2018 at 7:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > > I think if we do Analyze on the table after populating rows, it should > > use just one worker and that should be sufficient to hit the case > > being discussed. I would like to change the test so that it uses just > > one worker. > > I thought that adding an ANALYZE would make the test be net slower, not > faster; ANALYZE isn't free, even on just a row or so. > Hmm, I am curious to know what is your theory behind this? I was under impression that spawning two additional workers would cost more than Analyze. > Also, I believe > that coding the test this way makes the leader send the param values to > multiple workers, which would flush out any problems with serializing a > value multiple times. As against that, there's a hazard that the number > of workers might not be stable ... Yeah, I was actually more worried about instability part, but now I have tested it on both windows and centos machine and the test passes, so I am okay with that. However, I feel if we want to go with that, there is actually no need of statement "SET force_parallel_mode=1". That statement is required to force parallelism on such a small table. It won't harm, but is misleading and in future people will try to copy it in other tests > but it seems like we have lots of > other occurrences of that same hazard elsewhere in this test script. > Right. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > On Tue, Oct 2, 2018 at 7:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Also, I believe >> that coding the test this way makes the leader send the param values to >> multiple workers, which would flush out any problems with serializing a >> value multiple times. As against that, there's a hazard that the number >> of workers might not be stable ... > Yeah, I was actually more worried about instability part, but now I > have tested it on both windows and centos machine and the test passes, > so I am okay with that. However, I feel if we want to go with that, > there is actually no need of statement "SET force_parallel_mode=1". OK, I hadn't tested to see if that could be dropped, but if it can, then we don't need it. The EXPLAIN is enough to ensure that the test is doing what we want. (I think we could drop the savepoint too, no?) regards, tom lane
On Tue, Oct 2, 2018 at 9:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > > On Tue, Oct 2, 2018 at 7:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Also, I believe > >> that coding the test this way makes the leader send the param values to > >> multiple workers, which would flush out any problems with serializing a > >> value multiple times. As against that, there's a hazard that the number > >> of workers might not be stable ... > > > Yeah, I was actually more worried about instability part, but now I > > have tested it on both windows and centos machine and the test passes, > > so I am okay with that. However, I feel if we want to go with that, > > there is actually no need of statement "SET force_parallel_mode=1". > > OK, I hadn't tested to see if that could be dropped, but if it can, > then we don't need it. The EXPLAIN is enough to ensure that the > test is doing what we want. > Right. > (I think we could drop the savepoint > too, no?) > One advantage of keeping the savepoint is that we don't need to explicitly drop the objects which we have created temporarily for this test. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > On Tue, Oct 2, 2018 at 9:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (I think we could drop the savepoint >> too, no?) > One advantage of keeping the savepoint is that we don't need to > explicitly drop the objects which we have created temporarily for this > test. They'll go away anyway at the end of the transaction that the whole script is wrapped in. (But it might be worth choosing slightly less generic object names, to avoid a conflict against other sub-tests later in that script.) regards, tom lane
On Tue, Oct 2, 2018 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > > On Tue, Oct 2, 2018 at 9:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> (I think we could drop the savepoint > >> too, no?) > > > One advantage of keeping the savepoint is that we don't need to > > explicitly drop the objects which we have created temporarily for this > > test. > > They'll go away anyway at the end of the transaction that the whole > script is wrapped in. That's right, will remove savepoint. > (But it might be worth choosing slightly less > generic object names, to avoid a conflict against other sub-tests > later in that script.) > The function name and statement name seems okay to me. How about changing the table name to fooarr or arrtest? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > On Tue, Oct 2, 2018 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (But it might be worth choosing slightly less >> generic object names, to avoid a conflict against other sub-tests >> later in that script.) > The function name and statement name seems okay to me. How about > changing the table name to fooarr or arrtest? Up to you. regards, tom lane
On Tue, Oct 2, 2018 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > > On Tue, Oct 2, 2018 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> (But it might be worth choosing slightly less > >> generic object names, to avoid a conflict against other sub-tests > >> later in that script.) > > > The function name and statement name seems okay to me. How about > > changing the table name to fooarr or arrtest? > > Up to you. > Okay, I have pushed the test case patch on HEAD. Attached is the code-fix patch, let's wait for a day so that we have all the results which can help us to discuss the merits of this patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On Tue, Oct 2, 2018 at 11:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Oct 2, 2018 at 10:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Amit Kapila <amit.kapila16@gmail.com> writes: > > > On Tue, Oct 2, 2018 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > >> (But it might be worth choosing slightly less > > >> generic object names, to avoid a conflict against other sub-tests > > >> later in that script.) > > > > > The function name and statement name seems okay to me. How about > > > changing the table name to fooarr or arrtest? > > > > Up to you. > > > > Okay, I have pushed the test case patch on HEAD. Attached is the > code-fix patch, let's wait for a day so that we have all the results > which can help us to discuss the merits of this patch. > By now, the added test has failed on gharial [1] with below log on the server: 2018-10-02 00:46:57.019 MDT [5bb31455.2066:193] LOG: statement: CREATE TABLE fooarr(f1 text, f2 int[], f3 text); 2018-10-02 00:46:57.147 MDT [5bb31455.2066:194] LOG: statement: INSERT INTO fooarr VALUES('1', ARRAY[1,2], 'one'); 2018-10-02 00:46:57.149 MDT [5bb31455.2066:195] LOG: statement: PREPARE pstmt(text, int[]) AS SELECT * FROM fooarr WHERE f1 = $1 AND f2 = $2; 2018-10-02 00:46:57.153 MDT [5bb31455.2066:196] LOG: statement: EXPLAIN (COSTS OFF) EXECUTE pstmt('1', make_some_array(1,2)); 2018-10-02 00:46:57.157 MDT [5bb31455.2066:197] LOG: statement: EXECUTE pstmt('1', make_some_array(1,2)); 2018-10-02 00:46:57.157 MDT [5bb31455.2066:198] DETAIL: prepare: PREPARE pstmt(text, int[]) AS SELECT * FROM fooarr WHERE f1 = $1 AND f2 = $2; 2018-10-02 00:46:57.784 MDT [5bb313c8.1da5:4] LOG: server process (PID 8294) was terminated by signal 10 2018-10-02 00:46:57.784 MDT [5bb313c8.1da5:5] DETAIL: Failed process was running: EXECUTE pstmt('1', make_some_array(1,2)); 2018-10-02 00:46:57.784 MDT [5bb313c8.1da5:6] LOG: terminating any other active server processes AFAICS, the machine details are as follows: uname -m = ia64 uname -r = B.11.31 uname -s = HP-UX uname -v = U This seems to indicate that this hardware is alignment-sensitive. I haven't done a detailed study of this architecture, but if anybody is familiar with this architecture, then feel free to share your thoughts. On a quick look at one of on this architecture, there doesn't seem to be any recommendation on aligning the memory access to 8-byte boundary. See 3.2.2 section (Addressable Units and Alignment) in doc [2]. This test doesn't seem to be executed on the chipmunk, so will wait for its results. I will further look into it tomorrow morning. [1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2018-10-02%2006%3A30%3A47 [2] - https://www.csee.umbc.edu/portal/help/architecture/aig.pdf -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: >> Okay, I have pushed the test case patch on HEAD. Attached is the >> code-fix patch, let's wait for a day so that we have all the results >> which can help us to discuss the merits of this patch. > By now, the added test has failed on gharial [1] with below log on the server: Yeah, gharial and anole both don't like this, which is interesting but not really surprising, considering that IA64 is in some part an HPPA follow-on architecture. What I find more interesting is that both of the live Sparc critters are happy --- so despite Thomas' statements upthread, they're coping with unaligned accesses. Maybe you should have back-patched the test to older branches so we could see what castoroides and protosciurus would do. But it's probably not worth additional delay. regards, tom lane
On Wed, Oct 3, 2018 at 7:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: > >> Okay, I have pushed the test case patch on HEAD. Attached is the > >> code-fix patch, let's wait for a day so that we have all the results > >> which can help us to discuss the merits of this patch. > > > By now, the added test has failed on gharial [1] with below log on the server: > > Yeah, gharial and anole both don't like this, which is interesting > but not really surprising, considering that IA64 is in some part > an HPPA follow-on architecture. What I find more interesting is > that both of the live Sparc critters are happy --- so despite > Thomas' statements upthread, they're coping with unaligned accesses. Just for fun: $ curl -O https://people.debian.org/~aurel32/qemu/sparc/debian_etch_sparc_small.qcow2 $ qemu-system-sparc -hda debian_etch_sparc_small.qcow2 -nographic ... Debian GNU/Linux 4.0 debian-sparc ttyS0 debian-sparc login: root Password: root ... debian-sparc:~# cat <<EOF > /etc/apt/sources.list deb http://archive.debian.org/debian/ etch main deb-src http://archive.debian.org/debian/ etch main deb http://archive.debian.org/debian-security/ etch/updates main contrib EOF debian-sparc:~# apt-get update ... debian-sparc:~# apt-get dist-upgrade ... debian-sparc:~# apt-get install gcc libc6-dev ... debian-sparc:~# cat <<EOF > test.c int main(int argc, char **argv) { int i[2]; int *p; p = (int *)(((char *) &i[0]) + 2); *p = 42; } EOF debian-sparc:~# gcc test.c debian-sparc:~# ./a.out Bus error -- Thomas Munro http://www.enterprisedb.com
On Wed, Oct 3, 2018 at 12:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Amit Kapila <amit.kapila16@gmail.com> writes: > >> Okay, I have pushed the test case patch on HEAD. Attached is the > >> code-fix patch, let's wait for a day so that we have all the results > >> which can help us to discuss the merits of this patch. > > > By now, the added test has failed on gharial [1] with below log on the server: > > Yeah, gharial and anole both don't like this, which is interesting > but not really surprising, considering that IA64 is in some part > an HPPA follow-on architecture. > Now chipmunk also failed for the same test. > What I find more interesting is > that both of the live Sparc critters are happy --- so despite > Thomas' statements upthread, they're coping with unaligned accesses. > Maybe you should have back-patched the test to older branches so > we could see what castoroides and protosciurus would do. But it's > probably not worth additional delay. > Agreed, I will push the code-fix on HEAD and code+test in back-branches. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, Oct 3, 2018 at 8:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Oct 3, 2018 at 12:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Amit Kapila <amit.kapila16@gmail.com> writes: > > >> Okay, I have pushed the test case patch on HEAD. Attached is the > > >> code-fix patch, let's wait for a day so that we have all the results > > >> which can help us to discuss the merits of this patch. > > > > > By now, the added test has failed on gharial [1] with below log on the server: > > > > Yeah, gharial and anole both don't like this, which is interesting > > but not really surprising, considering that IA64 is in some part > > an HPPA follow-on architecture. > > > > Now chipmunk also failed for the same test. > > > What I find more interesting is > > that both of the live Sparc critters are happy --- so despite > > Thomas' statements upthread, they're coping with unaligned accesses. > > Maybe you should have back-patched the test to older branches so > > we could see what castoroides and protosciurus would do. But it's > > probably not worth additional delay. > > > > Agreed, I will push the code-fix on HEAD and code+test in back-branches. > Pushed, let's wait and see if this can make all the failing buidfarm members (due to this issue) happy. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, Oct 3, 2018 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Now chipmunk also failed for the same test. > > > > > What I find more interesting is > > > that both of the live Sparc critters are happy --- so despite > > > Thomas' statements upthread, they're coping with unaligned accesses. > > > Maybe you should have back-patched the test to older branches so > > > we could see what castoroides and protosciurus would do. But it's > > > probably not worth additional delay. > > > > > > > Agreed, I will push the code-fix on HEAD and code+test in back-branches. > > > > Pushed, let's wait and see if this can make all the failing buidfarm > members (due to this issue) happy. > All the affected members (gharial, chipmunk, anole) are happy. It feels good to see chipmunk becoming green after so many days. Thanks a lot, Tom for your assistance. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > All the affected members (gharial, chipmunk, anole) are happy. It > feels good to see chipmunk becoming green after so many days. Yup. I've marked this item fixed on the open-items list. regards, tom lane