BUG #16389: OOM on CTE since version 12

From: PG Bug reporting form
Date: 2020-04-25 07:08:06 +0000

The following bug has been logged on the website:

Bug reference:      16389
Logged by:          Rikard Pavelic
Email address:      rikard@ngs.hr
PostgreSQL version: 12.2
Operating system:   Centos 7, Linux 3.10
Description:

Hi, we are testing an upgrade from 11 to 12 and encountered some strange
OOMs. This one happened consistently on several instances, while it ran
normally on many others:

org.postgresql.util.PSQLException: ERROR: out of memory
  Detail: Failed on request of size 52 in memory context "ExecutorState".
Where: "WITH source AS (
    SELECT 
        i,
        row_number() over () as ord,
        (t.old).amount_array AS old,
        (t.new).amount_array AS new,
        coalesce(array_upper((t.old).amount_array, 1), 0) AS old_count,
        coalesce(array_upper((t.new).amount_array, 1), 0) AS new_count,
        (SELECT array_agg(row(_r.pk1, _r.pk2)) FROM unnest((t.old).amount_array)
_r) as old_pks,
        (SELECT array_agg(row(_r.pk1, _r.pk2)) FROM unnest((t.new).amount_array)
_r) as new_pks,
        (SELECT array_agg(i) FROM generate_series(1, CASE WHEN
coalesce(array_upper((t.old).amount_array, 1), 0) >
coalesce(array_upper((t.new).amount_array, 1), 0) THEN
array_upper((t.old).amount_array, 1) ELSE array_upper((t.new).amount_array,
1) END) i) as indexes
    FROM global_temp_input_table t
    WHERE 
        NOT (t.old).amount_array IS NULL AND (t.new).amount_array IS NULL
        OR (t.old).amount_array IS NULL AND NOT (t.new).amount_array IS NULL
        OR NOT (t.old).amount_array IS NULL AND NOT (t.new).amount_array IS NULL
AND (t.old).amount_array != (t.new).amount_array
)
INSERT INTO global_temp_update
SELECT i, index, old[index] AS old, 
    CASE 
        WHEN old_pks[index] = new_pks[index] THEN new[index] 
        WHEN NOT COALESCE(old_pks[index] = ANY (new_pks), false) THEN null
        ELSE (select n from unnest(new) n where n.pk1 = old[index].pk1 AND n.pk2 =
old[index].pk2)
    END AS changed,
    new[index] AS new,
    CASE WHEN old_pks[index] = new_pks[index] THEN false ELSE NOT
new_pks[index] IS NULL AND NOT COALESCE(new_pks[index] = ANY (old_pks),
false) END AS is_new
FROM
(
    SELECT i as _i, ord as _ord, unnest((SELECT array_agg(i) FROM
generate_series(1, CASE WHEN old_count > new_count THEN old_count ELSE
new_count END) i)) as index
    FROM source s
) ix
INNER JOIN source sq ON ix._i = sq.i AND ix._ord = sq.ord"

The global input temp table looks like this:

CREATE UNLOGGED TABLE global_temp_input_table
(
    i integer,
    old data_source_view,
    new data_source_view
);

where data_source_view is a view with an array of composites (amount_array)
which gets unnested and compared here.
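For illustration only, here is a minimal reconstruction of the types involved; the element type name and its amount field are guesses from the query text (only i, amount_array, pk1 and pk2 appear in the original), and data_source_view is really a view's rowtype rather than a standalone type:

```sql
-- Hypothetical sketch of the schema the query above assumes.
-- amount_entry and its "amount" field are invented for illustration.
CREATE TYPE amount_entry AS (pk1 integer, pk2 integer, amount numeric);
CREATE TYPE data_source_view AS (amount_array amount_entry[]);

CREATE UNLOGGED TABLE global_temp_input_table
(
    i   integer,
    old data_source_view,
    new data_source_view
);
```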

While the query is not optimal, the OOMs stopped after we added MATERIALIZED
to the CTE (which should not be necessary, as the CTE is referenced twice and
so should not be inlined anyway).
Anyway, it's no longer an immediate problem for us, as we've rewritten the
query into a saner form, but it hints at some underlying issue with the
recent CTE optimization.
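For reference, the workaround was just the explicit MATERIALIZED keyword that version 12 introduced, sketched here with the query body elided:

```sql
-- Forces the pre-12 behavior of evaluating the CTE once into a tuplestore:
WITH source AS MATERIALIZED (
    SELECT ...
)
INSERT INTO global_temp_update
SELECT ...
```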

Regards,
Rikard


Re: BUG #16389: OOM on CTE since version 12

From: Tomas Vondra
Hi Rikard,

On Sat, Apr 25, 2020 at 07:08:06AM +0000, PG Bug reporting form wrote:
>[original report snipped; query, error detail, and schema quoted in full above]
>
>While the query is not optimal, the OOMs stopped after we put MATERIALIZED
>on top (which does not seem necessary as it's used twice).
>Anyway, it's not an immediate problem for us anymore, as we've rewritten it
>into a saner form,
>but it hints at some underlying issue with recent CTE optimization.
>

It's difficult to say what went wrong based just on the query text, but
seeing that you're using composites as columns, I wouldn't be surprised
if this was a case of poor estimates. I'd bet we only build stats
(MCV, histograms, ...) for the whole composite value, not for the
individual fields, which means estimates for the various conditions may
be arbitrarily off. That can easily lead to plans using e.g. hash
aggregate, and to OOM. It's possible some of the instances were simply
"lucky" not to trigger this, e.g. due to slightly different data.


It'd be good to see explains for both the original and modified query,
and explain analyze for both (or at least for the modified one, if the
older one still ends with OOM).
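Something like this, i.e. the whole statement prefixed with EXPLAIN; note that EXPLAIN ANALYZE of an INSERT actually executes it, so wrap it in a transaction you can roll back:

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)   -- plain EXPLAIN (no ANALYZE) if it OOMs
WITH source AS ( ... )
INSERT INTO global_temp_update
SELECT ... ;
ROLLBACK;
```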

Another thing you might do is grab memory context stats from the old
query before it fails with OOM. That will tell us which part of the
query allocates the memory. You'll need gdb or some other debugger,
roughly like this:

     -- in database
     select pg_backend_pid();
     ... run query ...

     -- in shell (wait until the query allocates enough memory)
     gdb -p $the_backend_pid
     (gdb) p MemoryContextStats(TopMemoryContext)


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services