Thread: BUG #6763: Severe memory leak with arrays and hstore

BUG #6763: Severe memory leak with arrays and hstore

From
karavelov@mail.bg
Date:
The following bug has been logged on the website:

Bug reference:      6763
Logged by:          Luben Karavelov
Email address:      karavelov@mail.bg
PostgreSQL version: 9.1.4
Operating system:   Debian Linux
Description:

I was trying to migrate a big ER table (user preferences) into a new table.

Old table:
old_prefs (
    user_id integer NOT NULL,
    name    varchar NOT NULL,
    value   varchar NOT NULL
);

New table:
preferences (
    user_id integer PRIMARY KEY,
    prefs   hstore
);
The query I have tried to use is:

INSERT INTO preferences
SELECT user_id,hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;

But the postgres backend consumed all the available memory (6G free + 4G
swap) and finally was killed by the kernel.

It's an 8G RAM machine, and here are the memory options from the config:

shared_buffers = 1800MB
temp_buffers = 16MB
work_mem = 64MB
maintenance_work_mem = 256MB
max_stack_depth = 2MB

Finally, I managed to migrate it in batches of 100-200k user ids, disconnecting
after each query in order to free the backend and the leaked memory.
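
Each batch was essentially the same query restricted to a range of user ids, along
these lines (the range shown is only an example; the exact boundaries varied from
batch to batch):

INSERT INTO preferences
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
WHERE user_id >= 0 AND user_id < 200000
GROUP BY user_id;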

Best regards
Luben

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Tom Lane
Date:
karavelov@mail.bg writes:
> The query I have tried to use is:

> INSERT INTO preferences
> SELECT user_id,hstore(array_agg(name), array_agg(value))
> FROM old_prefs
> GROUP BY user_id;

> But the postgres backend consumed all the available memory (6G free + 4G
> swap) and finally was killed by the kernel.

Is there any reason to think this is an actual leak, and not just "you
were trying to compute an impractically large value"?

            regards, tom lane

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
On 07/26/2012 09:32 AM, karavelov@mail.bg wrote:
> Finally I have managed to migrate it in batches of 100-200k user ids and
> disconnecting after each query in order to free the backend and leaked
> memory.
If you do it in batches, but you do NOT disconnect and reconnect, does
the backend continue to grow?

What's the output of:

SELECT count(sub.user_id), to_char(AVG(sub.n_prefs), '99999.99')
FROM (SELECT user_id, count(name) AS n_prefs FROM old_prefs GROUP BY user_id) AS sub;

and

SELECT pg_size_pretty(pg_total_relation_size('old_prefs'));

?

--
Craig Ringer

Re: BUG #6763: Severe memory leak with arrays and hstore

From
karavelov@mail.bg
Date:
----- Craig Ringer (ringerc@ringerc.id.au), on 26.07.2012 at 11:17 -----
> On 07/26/2012 09:32 AM, karavelov@mail.bg wrote:
>> Finally I have managed to migrate it in batches of 100-200k user ids and
>> disconnecting after each query in order to free the backend and leaked
>> memory.
> If you do it in batches, but you do NOT disconnect and reconnect, does
> the backend continue to grow?

[The rest of this reply arrived with its formatting mangled by the mail client;
the same content is resent, properly formatted, in the next message.]

Re: BUG #6763: Severe memory leak with arrays and hstore

From
luben karavelov
Date:
On Jul 26, 2012, at 11:17 AM, Craig Ringer wrote:

> On 07/26/2012 09:32 AM, karavelov@mail.bg wrote:
>> Finally I have managed to migrate it in batches of 100-200k user ids and
>> disconnecting after each query in order to free the backend and leaked
>> memory.
> If you do it in batches, but you do NOT disconnect and reconnect, does the backend continue to grow?
>
> What's the output of:
>
> SELECT count(sub.user_id), to_char(AVG(sub.n_prefs), '99999.99') FROM (
> SELECT user_id, count(name) AS n_prefs FROM old_prefs GROUP BY user_id) AS sub;
>
> and
>
> SELECT pg_size_pretty(pg_total_relation_size('old_prefs'));
>
> ?
>
> --
> Craig Ringer
>

Sorry for the broken formatting in the last message. Here it is again:

OK, I will do the procedure again, taking notes on each step.

First, here are the results of the queries you asked:

pg=> SELECT count(sub.user_id), to_char(AVG(sub.n_prefs), '99999.99') FROM (
          SELECT user_id, count(name) AS n_prefs FROM old_prefs GROUP BY user_id) AS sub;

  count  |  to_char
---------+-----------
 1257262 |      2.26
(1 row)

pg=> SELECT pg_size_pretty(pg_total_relation_size('old_prefs'));
 pg_size_pretty
----------------
 264 MB
(1 row)

pg=> \d old_prefs
        Table "public.old_prefs"
 Column  |       Type        | Modifiers
---------+-------------------+-----------
 user_id | integer           | not null
 name    | character varying | not null
 value   | character varying | not null
Indexes:
    "old_prefs_user_id_ids" btree (user_id)

Also, there is a maximum of 34 rows per user_id in old_prefs.
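
For reference, that can be double-checked with something like (the query here is just an illustration):

pg=> SELECT max(cnt) FROM (SELECT count(*) AS cnt FROM old_prefs GROUP BY user_id) AS s;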

Here is the new table I just created:

pg=> \d new_preferences
Table "public.new_preferences"
 Column  |  Type   | Modifiers
---------+---------+-----------
 user_id | integer | not null
 prefs   | hstore  |
Indexes:
    "new_preferences_pkey" PRIMARY KEY, btree (user_id)
Foreign-key constraints:
    "new_preferences_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE


Here is the newly connected backend:

root@pg:/var/log# ps axu | egrep '10.0.2.71|USER' | grep -v grep
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  0.0  0.0 2266944 3448 ?        Ss   15:23   0:00 postgres: pg pg 10.0.2.71(51734) idle


Migrating the first 200k of the users to the new scheme:

pg=> select count(*) from old_prefs where user_id<200000;
 count
--------
 174767
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id<200000 GROUP BY user_id;
INSERT 0 48993
pg=> commit;
COMMIT

Here is the backend:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  0.8  7.1 3081772 582712 ?      Ss   15:23   0:02 postgres: pg pg 10.0.2.71(51734) idle

Migrating another batch of users:

pg=> select count(*) from old_prefs where user_id>=200000 and user_id<600000;
 count
--------
 193824
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=200000 AND user_id<600000 GROUP BY user_id;
INSERT 0 54157
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  1.1  8.5 3176164 697444 ?      Ss   15:23   0:05 postgres: pg pg 10.0.2.71(51734) idle

Another batch:
pg=> select count(*) from old_prefs where user_id>=600000 and user_id<1100000;
 count
--------
 190504
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=600000 AND user_id<1100000 GROUP BY user_id;
INSERT 0 56199
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  0.7  9.6 3210224 791404 ?      Ss   15:23   0:08 postgres: pg pg 10.0.2.71(51734) idle

Another batch:

pg=> select count(*) from old_prefs where user_id>=1100000 and user_id<1600000;
 count
--------
 194965
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=1100000 AND user_id<1600000 GROUP BY user_id;
INSERT 0 60257
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  0.9 10.8 3277412 889860 ?      Ss   15:23   0:11 postgres: pg pg 10.0.2.71(51734) idle


So the Pg backend keeps growing by roughly 100MB per 200k rows from the old table, which become 50-60k rows in the new table.
Proceeding with another batch:

pg=> select count(*) from old_prefs where user_id>=1600000 and user_id<2400000;
 count
--------
 170858
(1 row)

Time: 83,994 ms
pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=1600000 AND user_id<2400000 GROUP BY user_id;
INSERT 0 55447
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  0.9 11.5 3277412 945560 ?      Ss   15:23   0:15 postgres: pg pg 10.0.2.71(51734) idle

Another batch:

pg=> select count(*) from old_prefs where user_id>=2400000 and user_id<3400000;
 count
--------
 200614
(1 row)

Time: 83,409 ms
pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=2400000 AND user_id<3400000 GROUP BY user_id;
INSERT 0 87940
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  1.2 16.2 3736968 1331796 ?     Ss   15:23   0:20 postgres: pg pg 10.0.2.71(51734) idle

Another batch:

pg=> select count(*) from old_prefs where user_id>=3400000 and user_id<3800000;
 count
--------
 161390
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=3400000 AND user_id<3800000 GROUP BY user_id;
ERROR:  insert or update on table "new_preferences" violates foreign key constraint "new_preferences_user_id_fkey"
DETAIL:  Key (user_id)=(3615131) is not present in table "users".
pg=> rollback;
ROLLBACK

Oops... have to clean up old_prefs; some users were deleted in the meantime:
pg=> delete from old_prefs where user_id not in (select user_id from users);
DELETE 7
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  1.4 26.4 4469520 2157588 ?     Ss   15:23   0:29 postgres: pg pg 10.0.2.71(51734) idle

Nearly 1G of growth on a rolled-back transaction...

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=3400000 AND user_id<3800000 GROUP BY user_id;
INSERT 0 131803
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  1.7 26.6 4479944 2180536 ?     Ss   15:23   0:35 postgres: pg pg 10.0.2.71(51734) idle

Another batch, bigger this time:

pg=> select count(*) from old_prefs where user_id>=3800000 and user_id<4200000;
 count
--------
 327374
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=3800000 AND user_id<4200000 GROUP BY user_id;
INSERT 0 177044
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  1.9 33.1 5238968 2710756 ?     Ss   15:23   0:45 postgres: pg pg 10.0.2.71(51734) idle

Another big batch:

pg=> select count(*) from old_prefs where user_id>=4200000 and user_id<4400000;
 count
--------
 375352
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=4200000 AND user_id<4400000 GROUP BY user_id;
INSERT 0 189095
pg=> commit;
COMMIT

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  2.2 35.7 5438412 2918720 ?     Ss   15:23   0:55 postgres: pg pg 10.0.2.71(51734) idle

Now a smaller batch:

pg=> select count(*) from old_prefs where user_id>=4400000 and user_id<4500000;
 count
--------
 219249
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=4400000 AND user_id<4500000 GROUP BY user_id;
INSERT 0 99782
pg=> commit;
COMMIT

RSS keeps growing:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  2.2 36.0 5438412 2943784 ?     Ss   15:23   1:00 postgres: pg pg 10.0.2.71(51734) idle

Let's see if a bigger batch will pass:

pg=> select count(*) from old_prefs where user_id>=4500000;
 count
--------
 631911
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=4500000 GROUP BY user_id;
INSERT 0 296541
pg=> commit;
COMMIT

Ok, this time it passed, but the backend is over 4G
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 19121  2.2 50.0 7227968 4088928 ?     Ss   15:23   1:17 postgres: pg pg 10.0.2.71(51734) idle

Some observations:

1. Backend does not free allocated memory between transactions.
2. Rolled back transactions also leak memory.
3. Leaked memory is not linear in the work done - 2 transactions with 200k keys each will leak less than 1 transaction with 400k keys.


Regarding Tom's question:
The old_prefs table does not fit in work_mem but is quite small relative to the
total RAM. Isn't "work_mem" a limit on the memory each backend may allocate for
sorting, grouping and aggregation? My understanding is that a bigger allocation
will spill to disk and will not kill the server. I could be wrong though.

Thanks in advance and best regards

--
Luben Karavelov

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
Woah. Your email client did something insane, and I cannot read your
message. See below:

On 07/26/2012 09:37 PM, karavelov@mail.bg wrote:
> ----- Craig Ringer (ringerc@ringerc.id.au), on 26.07.2012 at 11:17 -----
>> On 07/26/2012 09:32 AM, karavelov@mail.bg wrote: >> Finally I have
> managed to migrate it in batches of 100-200k user ids and >> disconnecting
> after each query in order to free the backend and leaked >> memory. > If
> you do it in batches, but you do NOT disconnect and reconnect, does > the
> backend continue to grow?
> [... the rest of the mangled reply is trimmed here; its content was resent, properly formatted, earlier in the thread ...]
>

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
Hi all

Here's a fully self contained, automated test case that demonstrates the
leak.

Example output on my system, pasted as quote to stop Thunderbird
mangling it:

> $ ./hstore-leak-demo.sh
> NOTICE:  extension "hstore" already exists, skipping
> CREATE EXTENSION
> DROP TABLE
> CREATE TABLE
> CREATE TABLE
> INSERT 0 100000
> INSERT 0 80175
> INSERT 0 72267
> INSERT 0 50649
> INSERT 0 30430
>         avg         | max
> --------------------+-----
>  3.3352100000000000 |  20
> (1 row)
>
> Backend PID is: 3167
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167  0.0  0.0 504276  4368 ?        Ss   10:29   0:00 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 91.0 13.0 2163384 1055652 ?     Ss   10:29   0:00 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 77.5 13.2 2163408 1071496 ?     Ss   10:29   0:01 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167  108 13.4 2163408 1084056 ?     Ss   10:29   0:02 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 93.0 13.5 2163408 1092244 ?     Ss   10:29   0:02 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 86.0 13.6 2163408 1100444 ?     Ss   10:29   0:03 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167  101 13.7 2163408 1108704 ?     Ss   10:29   0:04 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 93.8 13.8 2163408 1116896 ?     Ss   10:29   0:04 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 88.8 13.9 2163408 1125048 ?     Ss   10:29   0:05 postgres: craig regress [local] idle
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres  3167 99.1 14.0 2163408 1133228 ?     Ss   10:29   0:05 postgres: craig regress [local] idle
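
In outline, the attached script just does something like the following - note that
this is only a sketch reconstructed from the output above, not the attachment
itself, and the table names and row counts are illustrative:

psql regress <<'EOF'
CREATE EXTENSION IF NOT EXISTS hstore;
DROP TABLE IF EXISTS leak_src;
DROP TABLE IF EXISTS leak_dst;
CREATE TABLE leak_src (user_id integer, name text, value text);
CREATE TABLE leak_dst (user_id integer, prefs hstore);
-- random key/value rows, a few keys per user on average
INSERT INTO leak_src
SELECT (random() * 30000)::int, 'k' || (random() * 20)::int, md5(random()::text)
FROM generate_series(1, 100000);
SELECT pg_backend_pid() AS backend_pid;
-- one "iteration": aggregate into hstore, then check this backend's size while it sits idle
INSERT INTO leak_dst
SELECT user_id, hstore(array_agg(name), array_agg(value)) FROM leak_src GROUP BY user_id;
\! ps axu | egrep '^USER|regress \[local\]' | grep -v grep
TRUNCATE leak_dst;
-- the real script repeats the INSERT / ps / TRUNCATE step in a loop
EOF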


Attachment

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
On 07/26/2012 09:55 PM, luben karavelov wrote:

> Ok, I will do the procedure again with taking notes on each step.

Thank you for taking the time to do this in detail.

> First, here are the results of the queries you asked:
>    count  |  to_char
> ---------+-----------
>   1257262 |      2.26

>   pg_size_pretty
> ----------------
>   264 MB

OK, none of that looks obviously insanely huge to me. That's a lot of
hstores, but with your query I wouldn't expect them to all sit around in
memory, and nothing individually is particularly huge.

> Also there are max of 34 rows per user_id in old_prefs

Thanks, I forgot to ask that. Again, nothing particularly big.

> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> postgres 19121  0.0  0.0 2266944 3448 ?        Ss   15:23   0:00 postgres: pg pg 10.0.2.71(51734) idle
> postgres 19121  0.8  7.1 3081772 582712 ?      Ss   15:23   0:02 postgres: pg pg 10.0.2.71(51734) idle
> postgres 19121  1.1  8.5 3176164 697444 ?      Ss   15:23   0:05 postgres: pg pg 10.0.2.71(51734) idle
> postgres 19121  0.7  9.6 3210224 791404 ?      Ss   15:23   0:08 postgres: pg pg 10.0.2.71(51734) idle
> postgres 19121  0.9 10.8 3277412 889860 ?      Ss   15:23   0:11 postgres: pg pg 10.0.2.71(51734) idle
> postgres 19121  0.9 11.5 3277412 945560 ?      Ss   15:23   0:15 postgres: pg pg 10.0.2.71(51734) idle
> postgres 19121  1.2 16.2 3736968 1331796 ?     Ss   15:23   0:20 postgres: pg pg 10.0.2.71(51734) idle

> 1. Backend does not free allocated memory between transactions.
> 2. Rolled back transactions also leak memory.
> 3. Leaked memory is not linear to work done - 2 transactions with 200k keys will leak less than 1 transaction with 400k keys

Ouch. Sure looks like a leak to me, yeah.

Thank you for taking the time to do this. It's common to see "leaks"
reported here that are really just queries that require lots of memory,
so the first response tends to be somewhat cautious.

This doesn't look like one of those reports.

I don't know if I can be much use with the actual tracking down of the
leak, but there certainly appears to be one, and you've provided a
pretty clear illustration of it.

--
Craig Ringer

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
On 07/27/2012 10:30 AM, Craig Ringer wrote:
> Hi all
>
> Here's a fully self contained, automated test case that demonstrates
> the leak.

Gah. Except it doesn't now, as shown by the text I pasted. WTF?

I was *definitely* seeing this on my system. What's changed?

Will follow up.

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
OK, it's certainly leaking, but not in the same drastic way I was able
to reproduce manually a couple of times earlier. Self-contained test
case attached.

$ bash hstore-leak-demo.sh
NOTICE:  extension "hstore" already exists, skipping
CREATE EXTENSION
DROP TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 100000
INSERT 0 80014
INSERT 0 72434
INSERT 0 50340
INSERT 0 30077
         avg         | max
--------------------+-----
  3.3286500000000000 |  21
(1 row)

Backend PID is: 4823
                 USER       PID %CPU %MEM    VSZ   RSS TTY STAT START   TIME COMMAND
Before first     postgres  4823  0.0  0.0 504276  4312 ? Ss   11:19   0:00 postgres: craig regress [local] idle
Iteration 1      postgres  4823  0.0  0.3 536908 25416 ? Rs   11:19   0:00 postgres: craig regress [local] INSERT
Iteration 2      postgres  4823 33.0 13.1 2163384 1056560 ? Rs   11:19   0:00 postgres: craig regress [local] INSERT
Iteration 3      postgres  4823 56.0 13.3 2163408 1072300 ? Rs   11:19   0:01 postgres: craig regress [local] INSERT
Iteration 4      postgres  4823 58.7 13.4 2163408 1084936 ? Rs   11:19   0:02 postgres: craig regress [local] INSERT
Iteration 20     postgres  4823 85.3 14.3 2173776 1156784 ? Rs   11:19   0:13 postgres: craig regress [local] INSERT
Iteration 40     postgres  4823 92.0 16.3 2176848 1314700 ? Rs   11:19   0:28 postgres: craig regress [local] INSERT
Iteration 60     postgres  4823 94.1 16.4 2173776 1322208 ? Rs   11:19   0:43 postgres: craig regress [local] INSERT
Iteration 80     postgres  4823 96.0 16.4 2173776 1323768 ? Rs   11:19   0:58 postgres: craig regress [local] INSERT
Iteration 100    postgres  4823 95.7 16.5 2176848 1329880 ? Rs   11:19   1:14 postgres: craig regress [local] INSERT
Iteration 120    postgres  4823 97.1 16.4 2176848 1329132 ? Rs   11:19   1:31 postgres: craig regress [local] INSERT
Iteration 140    postgres  4823 96.8 16.4 2176848 1329524 ? Rs   11:19   1:48 postgres: craig regress [local] INSERT


Attachment

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Tom Lane
Date:
Craig Ringer <ringerc@ringerc.id.au> writes:
> OK, it's certainly leaking, but not in the same drastic way I was able
> to reproduce manually a couple of times earlier. Self-contained test
> case attached.

Using HEAD with stock parameters, I don't see any significant change in
allocated address space (VSZ): it sits right around 170MB.  The reported
resident set size (RSS) starts from very little and rises to about
140MB, but I think that's just an artifact of the process touching more
and more of the shared-buffers array as it runs.  The actual backend
memory consumption seems to be just a few meg.

I can get it to blow out memory if I set work_mem large enough to
persuade the planner to use hash aggregation (whereupon it tries to run
all the array_agg aggregates in parallel).  However, that requires
work_mem set to a couple of GB, and I don't think it's really a bug when
the backend goes ahead and uses a couple of GB after I told it it could.

It's possible that the OP's problem boiled down to the planner making
a drastic underestimate of the number of GROUP BY groups, which could
mislead it into applying hash aggregation when there's not room; or
if the space used per aggregate was a lot more than the 8K that the
planner assumes when dealing with array_agg.  But neither of those
errors seems to be happening in this example case.

            regards, tom lane

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
On 07/27/2012 01:47 PM, Tom Lane wrote:
> Craig Ringer <ringerc@ringerc.id.au> writes:
>> OK, it's certainly leaking, but not in the same drastic way I was able
>> to reproduce manually a couple of times earlier. Self-contained test
>> case attached.
> Using HEAD with stock parameters, I don't see any significant change in
> allocated address space (VSZ): it sits right around 170MB.  The reported
> resident set size (RSS) starts from very little and rises to about
> 140MB, but I think that's just an artifact of the process touching more
> and more of the shared-buffers array as it runs.
Gah. I should know better than that. Sorry.

This makes me wonder if the "leak-like" pattern I saw earlier was just a
similar growth in shared_buffers, and carried on more steeply rather
than tapering off because I was working with a smaller data set.
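
One way to separate shared_buffers growth from backend-private growth next time
would be to sum the shared vs private mappings from /proc, something like this
on Linux (<backend_pid> is a placeholder, and this is only a sketch):

# totals the shared and private portions of the backend's resident mappings
grep -E '^(Shared|Private)_(Clean|Dirty):' /proc/<backend_pid>/smaps \
    | awk '{sum[$1] += $2} END {for (k in sum) print k, sum[k], "kB"}'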

--
Craig Ringer

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
On 07/27/2012 10:31 AM, Craig Ringer wrote:

> Ouch. Sure looks like a leak to me, yeah.
... but it turns out I'm not thinking very straight. I forgot to check
the size of your `shared_buffers' or `work_mem' and forgot to get you to
report `free -m' output after each run to measure _real_ memory use.

During testing I did get a backend crash when running an INSERT - which
I didn't expect given that I have only 20MB of work_mem and 256MB of
shared_buffers. I was surprised by that, as I would not have expected
that query to require a huge gob of RAM. I didn't dig much further, as
I'm on a swapless system with overcommit enabled ('cos the Java VM
doesn't work with overcommit off), which isn't exactly a recommended Pg
configuration.

--
Craig Ringer

Re: BUG #6763: Severe memory leak with arrays and hstore

From
luben karavelov
Date:
On Jul 27, 2012, at 8:47 AM, Tom Lane wrote:

> Craig Ringer <ringerc@ringerc.id.au> writes:
>> OK, it's certainly leaking, but not in the same drastic way I was able
>> to reproduce manually a couple of times earlier. Self-contained test
>> case attached.
>
> Using HEAD with stock parameters, I don't see any significant change in
> allocated address space (VSZ): it sits right around 170MB.  The reported
> resident set size (RSS) starts from very little and rises to about
> 140MB, but I think that's just an artifact of the process touching more
> and more of the shared-buffers array as it runs.  The actual backend
> memory consumption seems to be just a few meg.
>
> I can get it to blow out memory if I set work_mem large enough to
> persuade the planner to use hash aggregation (whereupon it tries to run
> all the array_agg aggregates in parallel).  However, that requires
> work_mem set to a couple of GB, and I don't think it's really a bug when
> the backend goes ahead and uses a couple of GB after I told it it could.
>
> It's possible that the OP's problem boiled down to the planner making
> a drastic underestimate of the number of GROUP BY groups, which could
> mislead it into applying hash aggregation when there's not room; or
> if the space used per aggregate was a lot more than the 8K that the
> planner assumes when dealing with array_agg.  But neither of those
> errors seems to be happening in this example case.
>
>             regards, tom lane

It's good that the bug is not in HEAD. I was testing on 9.1.4. Definitely the
size of RSS is not just references to shared buffers, because they are 1.8G and
the backend RSS got to 4G. My setting for work_mem is 64M, so it's quite
conservative - the server was tuned for max concurrency, not for max throughput
per single query.

Here is the plan of the insert:

=> explain INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id<200000 GROUP BY user_id;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Insert on new_preferences  (cost=65615.89..65617.73 rows=67 width=36)
   ->  HashAggregate  (cost=65615.89..65617.06 rows=67 width=68)
         ->  Bitmap Heap Scan on old_prefs  (cost=17645.25..56555.65 rows=1208032 width=68)
               Recheck Cond: (user_id < 200000)
               ->  Bitmap Index Scan on old_prefs_user_id_ids  (cost=0.00..17343.24 rows=1208032 width=0)
                     Index Cond: (user_id < 200000)
(6 rows)

So, it is using hash aggregate as you have suggested. I have tried the query
with hash aggregation disabled and it consumes a lot less memory - a single
query migrating the whole table finishes with 900M RSS.

After "ANALYZE old_prefs" the planner chooses GroupAggregate instead of
HashAggregate - you were right about the missing statistics on old_prefs.
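
Concretely, the two knobs involved were along these lines (the SET just forces
the plan for the session; the ANALYZE fixes the statistics so the planner picks
GroupAggregate by itself):

pg=> SET enable_hashagg = off;
pg=> ANALYZE old_prefs;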

Thank you for figuring out this case.

Best regards

--
luben karavelov

Re: BUG #6763: Severe memory leak with arrays and hstore

From
Craig Ringer
Date:
On 07/27/2012 07:52 PM, luben karavelov wrote:
>
> It's good that the bug is not in HEAD. I was testing on 9.1.4.
So was I, and while I thought I'd reproduced it I now suspect I was just
seeing shared_buffers touching.

Are you able to produce a self-contained SQL test that demonstrates the
leak?

Does the test program I posted behave differently on your system?

--
Craig Ringer