Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?) - Mailing list pgsql-bugs

From Sandro Santilli
Subject Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date
Msg-id 20140319085702.GF4042@localhost
Whole thread Raw
In response to Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)  (Sandro Santilli <strk@keybit.net>)
Responses Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)  (Greg Stark <stark@mit.edu>)
List pgsql-bugs
A valgrind view on the matter (over a single call, in --single mode):

==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 203 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==    by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240==    by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==    by 0x5A1E77: ExecProcNode (execProcnode.c:373)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 204 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B7EE: cdissect (regexec.c:671)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==    by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240==    by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 205 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 206 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B7EE: cdissect (regexec.c:671)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 207 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B97C: cdissect (regexec.c:975)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 227 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==    by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240==    by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 228 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B7EE: cdissect (regexec.c:671)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 229 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 230 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B512: cdissect (regexec.c:903)
==21240==    by 0x64BAF8: cdissect (regexec.c:1050)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==
==21240== LEAK SUMMARY:
==21240==    definitely lost: 69,064 bytes in 12 blocks
==21240==    indirectly lost: 38,143 bytes in 102 blocks
==21240==      possibly lost: 0 bytes in 0 blocks
==21240==    still reachable: 1,317,683 bytes in 342 blocks

--strk;

On Wed, Mar 19, 2014 at 08:53:59AM +0100, Sandro Santilli wrote:
> On Tue, Mar 18, 2014 at 07:41:32PM -0400, Tom Lane wrote:
> > Sandro Santilli <strk@keybit.net> writes:
> > > Tom: I saw you pushed a fix for the interruptability with this:
> > > https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0
> >
> > > But we also noticed a memory leak in the regepx_replace call, did you notice
> > > that in your tests ? Same regexp as reported. Do you need another testcase ?
> >
> > I see no memory growth while running your original example against HEAD.
>
> This is what I'm running:
>
>  SELECT regexp_matches('test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );
>
> And this is what I see as the process memory:
>
>   PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
>  5359 ?        Ss     0:00      0  5693 172446 2908  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 172718 4224  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 173250 4488  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 173670 5016  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 174070 5280  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 174494 5544  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 174762 5808  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 175690 6600  0.0 postgres: strk strk [local] idle
>  5359 ?        Ss     0:00      0  5693 179006 9240  0.0 postgres: strk strk [local] idle
>
> This is as of commit 63817f86b57fc3d29b57787bca9d786218b7ee25
>
> Or, simpler, from this state:
>
>   PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
>  5359 ?        Ss     0:00      0  5693 1274962 869832  5.3 postgres: strk strk [local] idle
>
> I run this:
>
>  SELECT count(*) from (
>    select regexp_matches( 'test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' )
>    FROM generate_series(1, 10000)
>  ) as f;
>
> And get here:
>
>   PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
>  5359 ?        Ss     0:00      0  5693 1274962 869832  5.3 postgres: strk strk [local] idle
>  5359 ?        Rs     0:00      0  5693 1555894 1090536  6.6 postgres: strk strk [local] SELECT
>  5359 ?        Ss     0:00      0  5693 2269682 1650744 10.0 postgres: strk strk [local] idle
>
> The memory is only released on session close.
> This is with generate_series going up to 1e5:
>
>   PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
>  5359 ?        Ss     0:00      0  5693 2269682 1650744 10.0 postgres: strk strk [local] idle
>  5359 ?        Rs     0:00      0  5693 2300726 1675560 10.2 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:01      0  5693 6438626 4923024 30.0 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:02      0  5693 10591778 8182368 50.0 postgres: strk strk [local] SELECT
>  5359 ?        Ss     0:03      0  5693 12216566 9457488 57.8 postgres: strk strk [local] idle
>
> Up to 1e6 the story ends with a crash, after:
>
>   PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
>  5359 ?        Ss     0:03      0  5693 12216566 9457488 57.8 postgres: strk strk [local] idle
>  5359 ?        Rs     0:04      0  5693 15666674 12166392 74.3 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:05      0  5693 19403254 15099056 92.3 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:06      0  5693 20284546 15366412 93.9 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:07      1  5693 21665438 15333312 93.7 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:08      1  5693 23065426 15295472 93.5 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:09      1  5693 24468334 15256592 93.2 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:10      1  5693 25869382 15225544 93.0 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:11      1  5693 27280514 15210012 92.9 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:12      1  5693 28684350 15173964 92.7 postgres: strk strk [local] SELECT
>  5359 ?        Ds     0:12      1  5693 29683014 15100680 92.3 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:13      1  5693 30298130 15051580 92.0 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:13      1  5693 30979690 15088992 92.2 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:13      1  5693 31758466 15139792 92.5 postgres: strk strk [local] SELECT
>  5359 ?        Ds     0:14      1  5693 32406206 15111128 92.3 postgres: strk strk [local] SELECT
>  5359 ?        Ds     0:14      1  5693 33045722 15131676 92.5 postgres: strk strk [local] SELECT
>  5359 ?        Ds     0:14      1  5693 33479010 15135036 92.5 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:15      1  5693 33847174 15146748 92.6 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:15      1  5693 34186830 15131648 92.5 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:15      1  5693 34339878 15119460 92.4 postgres: strk strk [local] SELECT
>  5359 ?        Ds     0:15      1  5693 34502474 15139816 92.5 postgres: strk strk [local] SELECT
>  5359 ?        Ds     0:15      1  5693 34625554 15143068 92.5 postgres: strk strk [local] SELECT
>  5359 ?        Rs     0:15   1833     0     0     0  0.0 [postgres]
>
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> The configure switches:
>
> ./configure \
>         --prefix=/home/postgresql-9.3.4 \
>         --with-libxml \
>         --with-python \
>         --enable-nls \
>         --enable-cassert \
>         --enable-debug \
>         --with-ossp-uuid
>
> --strk;
>
>  ()  ASCII ribbon campaign  --  Keep it simple !
>  /\  http://strk.keybit.net/rants/ascii_mails.txt

--

 ()  ASCII ribbon campaign  --  Keep it simple !
 /\  http://strk.keybit.net/rants/ascii_mails.txt

pgsql-bugs by date:

Previous
From: Jesse Denardo
Date:
Subject: Re: BUG #9616: Materialized view with indexes unable to load from pg_dump
Next
From: Greg Stark
Date:
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)