Thread: Patch to fix write after end of array in hashed agg initialization
Attached is a patch for a write after allocated memory which we found in testing. It's an obscure case but can happen if the same column is used in different grouping keys, as in the example below, which uses tables from the regress test suite (build with --enable-cassert in order to turn on memory warnings). Patch is against master.
The hashed aggregate state has an array of column indices that is sized using the number of distinct non-aggregated columns across the agg's targetlist, quals, and input grouping columns. Because duplicates are eliminated from that set, the array can be under-allocated when the grouping keys themselves contain a repeated column, as below. Sizing it by the number of grouping columns plus the number of targetlist/qual columns not already among the grouping columns avoids this.
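To make the failure mode concrete, here is a minimal standalone sketch of the pattern (hypothetical names and sizes, not the actual PostgreSQL code): the allocation is driven by the number of distinct columns, while the writes are driven by the possibly-duplicated grouping-key list.

/*
 * Sketch only: size an index array from the de-duplicated column set,
 * then store one entry per grouping key.  With a duplicated key the
 * second store lands past the end of the allocation.
 */
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    int     grpColIdx[] = {4, 4};   /* same column used as both keys */
    int     numCols = 2;
    int     seen[16] = {0};
    int     numDistinct = 0;
    int    *hashColIdx;
    int     i;

    for (i = 0; i < numCols; i++)
        if (!seen[grpColIdx[i]]++)
            numDistinct++;

    /* under-allocated: 1 slot, but 2 entries get written below */
    hashColIdx = malloc(numDistinct * sizeof(int));

    for (i = 0; i < numCols; i++)
        hashColIdx[i] = grpColIdx[i];   /* i = 1 writes past the end */

    printf("allocated %d slot(s), wrote %d entries\n", numDistinct, numCols);
    free(hashColIdx);
    return 0;
}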
Regards,
Colm McHugh (Salesforce)
explain (costs off) select 1 from tenk where (hundred, thousand) in (select twothousand, twothousand from onek);
psql: WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x7f8b8901fa00, chunk 0x7f8b89020cd0
psql: WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x7f8b8901fa00, chunk 0x7f8b89020cd0
                         QUERY PLAN
-------------------------------------------------------------
 Hash Join
   Hash Cond: (tenk.hundred = onek.twothousand)
   ->  Seq Scan on tenk
         Filter: (hundred = thousand)
   ->  Hash
         ->  HashAggregate
               Group Key: onek.twothousand, onek.twothousand
               ->  Seq Scan on onek
(8 rows)
Colm McHugh <colm.mchugh@gmail.com> writes:
> Attached is a patch for a write after allocated memory which we found in
> testing. It's an obscure case but can happen if the same column is used in
> different grouping keys, as in the example below, which uses tables from
> the regress test suite (build with --enable-cassert in order to turn on
> memory warnings). Patch is against master.

I confirm the appearance of the memory-overwrite warnings in HEAD.
It looks like the bad code is (mostly) the fault of commit b5635948.
Andrew, can you take a look at this fix?

			regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> Attached is a patch for a write after allocated memory which we >> found in testing. Its an obscure case but can happen if the same >> column is used in different grouping keys, as in the example below, >> which uses tables from the regress test suite (build with >> --enable-cassert in order to turn on memory warnings). Patch is >> against master. Tom> I confirm the appearance of the memory-overwrite warnings in HEAD. Tom> It looks like the bad code is (mostly) the fault of commit Tom> b5635948. Andrew, can you take a look at this fix? I'll look into it. -- Andrew (irc:RhodiumToad)
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >>> Attached is a patch for a write after allocated memory which we >>> found in testing. Its an obscure case but can happen if the same >>> column is used in different grouping keys, as in the example below, >>> which uses tables from the regress test suite (build with >>> --enable-cassert in order to turn on memory warnings). Patch is >>> against master. Andrew> I'll look into it. OK, so my first impression is that this is down to (a) the fact that when planning a GROUP BY, we eliminate duplicate grouping keys; (b) due to (a), the executor code isn't expecting to have to deal with duplicates, but (c) when using a HashAgg to implement a Unique path, the planner code isn't making any attempt to eliminate duplicates so they get through. It was wrong before commit b5635948, looks like Andres' fc4b3dea2 which introduced the arrays and the concept of narrowing the stored tuples is the actual culprit. But I'll deal with fixing it anyway unless Andres has a burning desire to step in. My inclination is to fix this in the planner rather than the executor; there seems no good reason to actually hash a duplicate column more than once. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> My inclination is to fix this in the planner rather than the executor;
> there seems no good reason to actually hash a duplicate column more than
> once.

Sounds reasonable --- but would it make sense to introduce some
assertions, or other cheap tests, into the executor to check that it's
not being given a case it can't handle?

			regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> My inclination is to fix this in the planner rather than the >> executor; there seems no good reason to actually hash a duplicate >> column more than once. Tom> Sounds reasonable --- but would it make sense to introduce some Tom> assertions, or other cheap tests, into the executor to check that Tom> it's not being given a case it can't handle? Oh definitely, I was planning on it. -- Andrew (irc:RhodiumToad)
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> My inclination is to fix this in the planner rather than the Andrew> executor; there seems no good reason to actually hash a Andrew> duplicate column more than once. I take this back; I don't believe it's possible to eliminate duplicates in all cases. Consider (a,b) IN (select c,c from...), where a,b,c are different types; I don't think we can assume that (a=c) and (b=c) cross-type comparisons will necessarily induce the same hash function on c, and so we might legitimately need to keep it duplicated. So I'm going with a simpler method of ensuring the array is adequately sized at execution time and not touching the planner at all. Draft patch is attached, will commit it later. -- Andrew (irc:RhodiumToad) diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index 5b4a602952..6b8ef40599 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -1312,9 +1312,14 @@ build_hash_table(AggState *aggstate) * by themselves, and secondly ctids for row-marks. * * To eliminate duplicates, we build a bitmapset of the needed columns, and - * then build an array of the columns included in the hashtable. Note that - * the array is preserved over ExecReScanAgg, so we allocate it in the - * per-query context (unlike the hash table itself). + * then build an array of the columns included in the hashtable. We might + * still have duplicates if the passed-in grpColIdx has them, which can happen + * in edge cases from semijoins/distinct; these can't always be removed, + * because it's not certain that the duplicate cols will be using the same + * hash function. + * + * Note that the array is preserved over ExecReScanAgg, so we allocate it in + * the per-query context (unlike the hash table itself). */ static void find_hash_columns(AggState *aggstate) @@ -1335,6 +1340,7 @@ find_hash_columns(AggState *aggstate) AttrNumber *grpColIdx = perhash->aggnode->grpColIdx; List *hashTlist = NIL; TupleDesc hashDesc; + int maxCols; int i; perhash->largestGrpColIdx = 0; @@ -1359,15 +1365,24 @@ find_hash_columns(AggState *aggstate) colnos = bms_del_member(colnos, attnum); } } - /* Add in all the grouping columns */ - for (i = 0; i < perhash->numCols; i++) - colnos = bms_add_member(colnos, grpColIdx[i]); + + /* + * Compute maximum number of input columns accounting for possible + * duplications in the grpColIdx array, which can happen in some edge + * cases where HashAggregate was generated as part of a semijoin or a + * DISTINCT. + */ + maxCols = bms_num_members(colnos) + perhash->numCols; perhash->hashGrpColIdxInput = - palloc(bms_num_members(colnos) * sizeof(AttrNumber)); + palloc(maxCols * sizeof(AttrNumber)); perhash->hashGrpColIdxHash = palloc(perhash->numCols * sizeof(AttrNumber)); + /* Add all the grouping columns to colnos */ + for (i = 0; i < perhash->numCols; i++) + colnos = bms_add_member(colnos, grpColIdx[i]); + /* * First build mapping for columns directly hashed. 
These are the * first, because they'll be accessed when computing hash values and diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out index fed69dc9e1..2e5ce8cc32 100644 --- a/src/test/regress/expected/aggregates.out +++ b/src/test/regress/expected/aggregates.out @@ -2270,3 +2270,21 @@ select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*) ba | 0 | 1 (2 rows) +-- Make sure that generation of HashAggregate for uniqification purposes +-- does not lead to array overflow due to unexpected duplicate hash keys +-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com +explain (costs off) + select 1 from tenk1 + where (hundred, thousand) in (select twothousand, twothousand from onek); + QUERY PLAN +------------------------------------------------------------- + Hash Join + Hash Cond: (tenk1.hundred = onek.twothousand) + -> Seq Scan on tenk1 + Filter: (hundred = thousand) + -> Hash + -> HashAggregate + Group Key: onek.twothousand, onek.twothousand + -> Seq Scan on onek +(8 rows) + diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql index 230f44666a..ca0d5e66b2 100644 --- a/src/test/regress/sql/aggregates.sql +++ b/src/test/regress/sql/aggregates.sql @@ -988,3 +988,10 @@ select v||'a', case v||'a' when 'aa' then 1 else 0 end, count(*) select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*) from unnest(array['a','b']) u(v) group by v||'a' order by 1; + +-- Make sure that generation of HashAggregate for uniqification purposes +-- does not lead to array overflow due to unexpected duplicate hash keys +-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com +explain (costs off) + select 1 from tenk1 + where (hundred, thousand) in (select twothousand, twothousand from onek);
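A standalone sketch of the sizing rule the patch adopts (hypothetical names, not the real code): allocate for the distinct non-grouping columns plus one slot per grouping key, with the count taken before the grouping keys are merged into the column set, so duplicates in grpColIdx can only make the final count smaller, never larger.

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    int     grpColIdx[] = {4, 4};   /* duplicated grouping key */
    int     numCols = 2;
    int     numOtherCols = 1;       /* distinct non-grouping columns needed */
    int     maxCols = numOtherCols + numCols;   /* safe upper bound */
    int    *hashColIdx = malloc(maxCols * sizeof(int));
    int     n = 0;

    hashColIdx[n++] = 7;            /* the non-grouping column */
    for (int i = 0; i < numCols; i++)
        hashColIdx[n++] = grpColIdx[i]; /* n never exceeds maxCols */

    printf("allocated %d slots, used %d\n", maxCols, n);
    free(hashColIdx);
    return 0;
}

Note that the patch computes maxCols from colnos before the grouping columns are merged in, which is why the bms_add_member() loop moves below the allocations.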