Thread: Leaking regexp_replace in 9.3.1 ? (was: [HACKERS] Uninterruptable regexp_replace in 9.3.1 ?)

On Wed, Feb 26, 2014 at 04:01:13PM +0100, Sandro Santilli wrote:
> On Fri, Feb 21, 2014 at 08:17:36PM -0500, Tom Lane wrote:
> > I wrote:
> > > Craig Ringer <craig@2ndquadrant.com> writes:
> > >> So I'd like to confirm that this issue doesn't affect 9.1.
> >
> > > It doesn't.  I suspect it has something to do with 173e29aa5 or one
> > > of the nearby commits in backend/regex/.
> >
> > Indeed, git bisect fingers that commit as introducing the problem.
>
> Is there anything I can do for this not to fade into oblivion ?
> Will a mail to pgsql-bugs help ?

Tom: I saw you pushed a fix for the interruptibility issue with this:
https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0

But we also noticed a memory leak in the regexp_replace call; did you notice
that in your tests ? Same regexp as reported. Do you need another testcase ?

--strk;

 ()  ASCII ribbon campaign  --  Keep it simple !
 /\  http://strk.keybit.net/rants/ascii_mails.txt
Sandro Santilli <strk@keybit.net> writes:
> Tom: I saw you pushed a fix for the interruptibility issue with this:
> https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0

> But we also noticed a memory leak in the regexp_replace call; did you notice
> that in your tests ? Same regexp as reported. Do you need another testcase ?

I see no memory growth while running your original example against HEAD.

            regards, tom lane
On Tue, Mar 18, 2014 at 07:41:32PM -0400, Tom Lane wrote:
> Sandro Santilli <strk@keybit.net> writes:
> > Tom: I saw you pushed a fix for the interruptibility issue with this:
> > https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0
>
> > But we also noticed a memory leak in the regexp_replace call; did you notice
> > that in your tests ? Same regexp as reported. Do you need another testcase ?
>
> I see no memory growth while running your original example against HEAD.

This is what I'm running:

 SELECT regexp_matches('test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );

And this is what I see as the process memory:

  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 5359 ?        Ss     0:00      0  5693 172446 2908  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 172718 4224  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 173250 4488  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 173670 5016  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 174070 5280  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 174494 5544  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 174762 5808  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 175690 6600  0.0 postgres: strk strk [local] idle
 5359 ?        Ss     0:00      0  5693 179006 9240  0.0 postgres: strk strk [local] idle

This is as of commit 63817f86b57fc3d29b57787bca9d786218b7ee25

Or, simpler, from this state:

  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 5359 ?        Ss     0:00      0  5693 1274962 869832  5.3 postgres: strk strk [local] idle

I run this:

 SELECT count(*) from (
   select regexp_matches( 'test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' )
   FROM generate_series(1, 10000)
 ) as f;

And get here:

  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 5359 ?        Ss     0:00      0  5693 1274962 869832  5.3 postgres: strk strk [local] idle
 5359 ?        Rs     0:00      0  5693 1555894 1090536  6.6 postgres: strk strk [local] SELECT
 5359 ?        Ss     0:00      0  5693 2269682 1650744 10.0 postgres: strk strk [local] idle

The memory is only released on session close.
This is with generate_series going up to 1e5:

  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 5359 ?        Ss     0:00      0  5693 2269682 1650744 10.0 postgres: strk strk [local] idle
 5359 ?        Rs     0:00      0  5693 2300726 1675560 10.2 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:01      0  5693 6438626 4923024 30.0 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:02      0  5693 10591778 8182368 50.0 postgres: strk strk [local] SELECT
 5359 ?        Ss     0:03      0  5693 12216566 9457488 57.8 postgres: strk strk [local] idle

With generate_series going up to 1e6, the story ends with a crash, after:

  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 5359 ?        Ss     0:03      0  5693 12216566 9457488 57.8 postgres: strk strk [local] idle
 5359 ?        Rs     0:04      0  5693 15666674 12166392 74.3 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:05      0  5693 19403254 15099056 92.3 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:06      0  5693 20284546 15366412 93.9 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:07      1  5693 21665438 15333312 93.7 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:08      1  5693 23065426 15295472 93.5 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:09      1  5693 24468334 15256592 93.2 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:10      1  5693 25869382 15225544 93.0 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:11      1  5693 27280514 15210012 92.9 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:12      1  5693 28684350 15173964 92.7 postgres: strk strk [local] SELECT
 5359 ?        Ds     0:12      1  5693 29683014 15100680 92.3 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:13      1  5693 30298130 15051580 92.0 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:13      1  5693 30979690 15088992 92.2 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:13      1  5693 31758466 15139792 92.5 postgres: strk strk [local] SELECT
 5359 ?        Ds     0:14      1  5693 32406206 15111128 92.3 postgres: strk strk [local] SELECT
 5359 ?        Ds     0:14      1  5693 33045722 15131676 92.5 postgres: strk strk [local] SELECT
 5359 ?        Ds     0:14      1  5693 33479010 15135036 92.5 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:15      1  5693 33847174 15146748 92.6 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:15      1  5693 34186830 15131648 92.5 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:15      1  5693 34339878 15119460 92.4 postgres: strk strk [local] SELECT
 5359 ?        Ds     0:15      1  5693 34502474 15139816 92.5 postgres: strk strk [local] SELECT
 5359 ?        Ds     0:15      1  5693 34625554 15143068 92.5 postgres: strk strk [local] SELECT
 5359 ?        Rs     0:15   1833     0     0     0  0.0 [postgres]

The connection to the server was lost. Attempting reset: Failed.
!>

The configure switches:

./configure \
        --prefix=/home/postgresql-9.3.4 \
        --with-libxml \
        --with-python \
        --enable-nls \
        --enable-cassert \
        --enable-debug \
        --with-ossp-uuid

--strk;

 ()  ASCII ribbon campaign  --  Keep it simple !
 /\  http://strk.keybit.net/rants/ascii_mails.txt
A valgrind view on the matter (over a single call, in --single mode):

==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 203 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==    by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240==    by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==    by 0x5A1E77: ExecProcNode (execProcnode.c:373)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 204 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B7EE: cdissect (regexec.c:671)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==    by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240==    by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 205 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 206 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B7EE: cdissect (regexec.c:671)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 207 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B97C: cdissect (regexec.c:975)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 227 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==    by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240==    by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 228 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B7EE: cdissect (regexec.c:671)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==    by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 229 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B80A: cdissect (regexec.c:673)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==    by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==    by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 230 of 239
==21240==    at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240==    by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
==21240==    by 0x64B512: cdissect (regexec.c:903)
==21240==    by 0x64BAF8: cdissect (regexec.c:1050)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64B46B: cdissect (regexec.c:633)
==21240==    by 0x64B8AA: cdissect (regexec.c:689)
==21240==    by 0x64BDE9: cdissect (regexec.c:692)
==21240==    by 0x64C802: pg_regexec (regexec.c:473)
==21240==    by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==
==21240== LEAK SUMMARY:
==21240==    definitely lost: 69,064 bytes in 12 blocks
==21240==    indirectly lost: 38,143 bytes in 102 blocks
==21240==      possibly lost: 0 bytes in 0 blocks
==21240==    still reachable: 1,317,683 bytes in 342 blocks
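
For reference, the trace above comes from running the backend in
single-user mode under valgrind, roughly like this (the flags, data
directory path and query file name here are illustrative, not the
exact invocation):

  valgrind --leak-check=full \
    postgres --single -D /path/to/data strk < repro.sql

with repro.sql holding the regexp_matches() query shown earlier.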

--strk;

 ()  ASCII ribbon campaign  --  Keep it simple !
 /\  http://strk.keybit.net/rants/ascii_mails.txt
On Wed, Mar 19, 2014 at 8:57 AM, Sandro Santilli <strk@keybit.net> wrote:
> ==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
> ==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
> ==21240==    by 0x64B80A: cdissect (regexec.c:673)
> ==21240==    by 0x64C802: pg_regexec (regexec.c:473)


It looks like a simple bug in pg_regexec where it's reusing a loop
counter "n" to mean two different things. When it goes to free all the
subdfas, it looks like the code was written assuming "n" was still the
number of trees, but in fact it's been reused as the number of
matches.
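
For illustration, here is a minimal self-contained C sketch of that
pattern (made-up names and sizes, not the actual pg_regexec code):

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    size_t  n;
    size_t  i;
    char  **subdfas;

    /* n starts out as the number of sub-DFAs, like v->g->ntree */
    n = 8;
    subdfas = malloc(n * sizeof(char *));
    for (i = 0; i < n; i++)
        subdfas[i] = malloc(100);   /* stand-in for newdfa() */

    /* ... later n gets reused for something else (the match count) */
    n = 2;

    /* cleanup loop still written as if n were the tree count */
    for (i = 0; i < n; i++)
        free(subdfas[i]);
    free(subdfas);

    printf("freed %zu of 8 sub-DFAs; the other 6 are leaked\n", n);
    return 0;
}

Because the regex code allocates these with plain malloc() rather than
in a palloc'd memory context, whatever the cleanup loop misses only
goes away at backend exit, which would explain why the memory is only
released on session close.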


--
greg
On Wed, Mar 19, 2014 at 10:57:10AM +0000, Greg Stark wrote:
> On Wed, Mar 19, 2014 at 8:57 AM, Sandro Santilli <strk@keybit.net> wrote:
> > ==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
> > ==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
> > ==21240==    by 0x64B80A: cdissect (regexec.c:673)
> > ==21240==    by 0x64C802: pg_regexec (regexec.c:473)
>
>
> It looks like a simple bug in pg_regexec where it's reusing a loop
> counter "n" to mean two different things. When it goes to free all the
> subdfas, it looks like the code was written assuming "n" was still the
> number of trees, but in fact it's been reused as the number of
> matches.

I confirm this patch fixes the leak:

diff --git a/src/backend/regex/regexec.c b/src/backend/regex/regexec.c
index 0edb83c..2e97662 100644
--- a/src/backend/regex/regexec.c
+++ b/src/backend/regex/regexec.c
@@ -259,6 +259,7 @@ pg_regexec(regex_t *re,
        /* clean up */
        if (v->pmatch != pmatch && v->pmatch != mat)
                FREE(v->pmatch);
+       n = (size_t) v->g->ntree;
        for (i = 0; i < n; i++)
        {
                if (v->subdfas[i] != NULL)

--strk;
On Wed, Mar 19, 2014 at 12:43:25PM +0100, Sandro Santilli wrote:
> On Wed, Mar 19, 2014 at 10:57:10AM +0000, Greg Stark wrote:
> > On Wed, Mar 19, 2014 at 8:57 AM, Sandro Santilli <strk@keybit.net> wrote:
> > > ==21240==    by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
> > > ==21240==    by 0x64A4C3: getsubdfa (regexec.c:285)
> > > ==21240==    by 0x64B80A: cdissect (regexec.c:673)
> > > ==21240==    by 0x64C802: pg_regexec (regexec.c:473)
> >
> >
> > It looks like a simple bug in pg_regexec where it's reusing a loop
> > counter "n" to mean two different things. When it goes to free all the
> > subdfas, it looks like the code was written assuming "n" was still the
> > number of trees, but in fact it's been reused as the number of
> > matches.
>
> I confirm this patch fixes the leak:

I've encoded another version of it as a pull request:
https://github.com/postgres/postgres/pull/5

--strk;
Greg Stark <stark@mit.edu> writes:
> It looks like a simple bug in pg_regexec where it's reusing a loop
> counter "n" to mean two different things. When it goes to free all the
> subdfas, it looks like the code was written assuming "n" was still the
> number of trees, but in fact it's been reused as the number of
> matches.

Looks like it's my bug (how embarrassing).  Thanks!

            regards, tom lane