Re: SerializeParamList vs machines with strict alignment - Mailing list pgsql-hackers

From Tom Lane
Subject Re: SerializeParamList vs machines with strict alignment
Date
Msg-id 27748.1538425312@sss.pgh.pa.us
Whole thread Raw
In response to Re: SerializeParamList vs machines with strict alignment  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: SerializeParamList vs machines with strict alignment  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Sun, Sep 30, 2018 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Attached is a patch along these lines, let me know if you have
>> something else in mind?  I have manually verified this with
>> force_parallel_mode=regress configuration in my development
>> environment.  I don't have access to alignment-sensitive hardware, so
>> can't test in such an environment.  I will see if I can write a test
>> as well.

> Attached patch contains a test case as well to cover this code.

This test case doesn't actually trigger the problem.  There are two
reasons: (1) it never invokes datumSerialize with an odd starting
address, and (2) the expanded array produced by the plpgsql function
contains a "flat" array, allowing EA_flatten_into to go through its
"easy" case, which just does a memcpy and so is not sensitive to
misalignment.  To fix (1) we need a previous parameter of odd length,
and to fix (2) we need the plpgsql function to build the array from
parts.  The cast to a domain is unnecessary and might indeed be a
third reason why it doesn't work --- I'd not be very surprised if
that results in array flattening.

The attached revised patch contains a test case that demonstrably triggers
the problem on gaur's host.  Oddly, I do not get a crash either on a PPC
Mac or a Raspberry Pi 3 running Raspbian.  I'm not very sure why; I traced
through things with gdb and it's definitely calling EA_flatten_into with
an odd address and a non-flattened input.  I guess both of those platforms
have kernel handlers for misaligned accesses?  But the Raspbian box ought
to be nearly the same as chipmunk, which is where we saw the problem to
begin with, so I'm a bit confused.

> Kindly suggest what
> is the best path forward, is it a good idea to commit this on HEAD
> first and see if the test passes on all machines in buildfarm and then
> back-patch it?

No, I think you should just commit and back-patch at the same time.
I don't have any doubt about the code patch being good, it's just
a question of whether the test case reliably triggers the problem.
And in the end that's not that important as long as it does so on
at least one buildfarm member.

            regards, tom lane

diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index f02a5e7..4957682 100644
*** a/src/backend/utils/adt/datum.c
--- b/src/backend/utils/adt/datum.c
*************** datumSerialize(Datum value, bool isnull,
*** 338,345 ****
          }
          else if (eoh)
          {
!             EOH_flatten_into(eoh, (void *) *start_address, header);
              *start_address += header;
          }
          else
          {
--- 338,356 ----
          }
          else if (eoh)
          {
!             char       *tmp;
!
!             /*
!              * EOH_flatten_into expects the target address to be maxaligned,
!              * so we can't store directly to *start_address.
!              */
!             tmp = (char *) palloc(header);
!             EOH_flatten_into(eoh, (void *) tmp, header);
!             memcpy(*start_address, tmp, header);
              *start_address += header;
+
+             /* be tidy. */
+             pfree(tmp);
          }
          else
          {
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 26409d3..d1f6510 100644
*** a/src/test/regress/expected/select_parallel.out
--- b/src/test/regress/expected/select_parallel.out
*************** ORDER BY 1, 2, 3;
*** 1088,1094 ****
  ------------------------------+---------------------------+-------------+--------------
  (0 rows)

! -- test interation between subquery and partial_paths
  SET LOCAL min_parallel_table_scan_size TO 0;
  CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1;
  EXPLAIN (COSTS OFF)
--- 1088,1124 ----
  ------------------------------+---------------------------+-------------+--------------
  (0 rows)

! -- test passing expanded-value representations to workers
! SAVEPOINT settings;
! SET force_parallel_mode = 1;
! CREATE FUNCTION make_some_array(int,int) returns int[] as
! $$declare x int[];
!   begin
!     x[1] := $1;
!     x[2] := $2;
!     return x;
!   end$$ language plpgsql parallel safe;
! CREATE TABLE foo(f1 text, f2 int[], f3 text);
! INSERT INTO foo VALUES('1', ARRAY[1,2], 'one');
! PREPARE pstmt(text, int[]) AS SELECT * FROM foo WHERE f1 = $1 AND f2 = $2;
! EXPLAIN (COSTS OFF) EXECUTE pstmt('1', make_some_array(1,2));
!                             QUERY PLAN
! ------------------------------------------------------------------
!  Gather
!    Workers Planned: 3
!    ->  Parallel Seq Scan on foo
!          Filter: ((f1 = '1'::text) AND (f2 = '{1,2}'::integer[]))
! (4 rows)
!
! EXECUTE pstmt('1', make_some_array(1,2));
!  f1 |  f2   | f3
! ----+-------+-----
!  1  | {1,2} | one
! (1 row)
!
! DEALLOCATE pstmt;
! ROLLBACK TO SAVEPOINT settings;
! -- test interaction between subquery and partial_paths
  SET LOCAL min_parallel_table_scan_size TO 0;
  CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1;
  EXPLAIN (COSTS OFF)
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 938c708..326c8f6 100644
*** a/src/test/regress/sql/select_parallel.sql
--- b/src/test/regress/sql/select_parallel.sql
*************** ORDER BY 1;
*** 410,416 ****
  SELECT * FROM information_schema.foreign_data_wrapper_options
  ORDER BY 1, 2, 3;

! -- test interation between subquery and partial_paths
  SET LOCAL min_parallel_table_scan_size TO 0;
  CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1;
  EXPLAIN (COSTS OFF)
--- 410,436 ----
  SELECT * FROM information_schema.foreign_data_wrapper_options
  ORDER BY 1, 2, 3;

! -- test passing expanded-value representations to workers
! SAVEPOINT settings;
! SET force_parallel_mode = 1;
!
! CREATE FUNCTION make_some_array(int,int) returns int[] as
! $$declare x int[];
!   begin
!     x[1] := $1;
!     x[2] := $2;
!     return x;
!   end$$ language plpgsql parallel safe;
! CREATE TABLE foo(f1 text, f2 int[], f3 text);
! INSERT INTO foo VALUES('1', ARRAY[1,2], 'one');
!
! PREPARE pstmt(text, int[]) AS SELECT * FROM foo WHERE f1 = $1 AND f2 = $2;
! EXPLAIN (COSTS OFF) EXECUTE pstmt('1', make_some_array(1,2));
! EXECUTE pstmt('1', make_some_array(1,2));
! DEALLOCATE pstmt;
! ROLLBACK TO SAVEPOINT settings;
!
! -- test interaction between subquery and partial_paths
  SET LOCAL min_parallel_table_scan_size TO 0;
  CREATE VIEW tenk1_vw_sec WITH (security_barrier) AS SELECT * FROM tenk1;
  EXPLAIN (COSTS OFF)

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: settings to control SSL/TLS protocol version
Next
From: Tom Lane
Date:
Subject: Re: has_column_privilege behavior (was Re: Assert failed in snprintf.c)