Thread: Re all: Report some potential memory leak bugs in pg_dump.c
However, there are also some potential leaks caused by convertTSFunction() rather than getFormattedTypeName(). Besides, in convertTSFunction(), there are no any comment to say that the memory should not be freed by the caller.
12262 /*
12263 * Convert a function OID obtained from pg_ts_parser or pg_ts_template
12264 *
12265 * It is sufficient to use REGPROC rather than REGPROCEDURE, since the
12266 * argument lists of these functions are predetermined. Note that the
12267 * caller should ensure we are in the proper schema, because the results
12268 * are search path dependent!
12269 */
12270 static char *
12271 convertTSFunction(Archive *fout, Oid funcOid)
12272 {
12273 char *result;
12274 char query[128];
12275 PGresult *res;
12276
12277 snprintf(query, sizeof(query),
12278 "SELECT '%u'::pg_catalog.regproc", funcOid);
12279 res = ExecuteSqlQueryForSingleRow(fout, query);
12280
12281 result = pg_strdup(PQgetvalue(res, 0, 0));
12282
12283 PQclear(res);
12284
12285 return result;
12286 }
The following seven suspects are caused by convertTSFunction().
5) At lines 13520 to 13521, in function dumpTSParser().
13520 appendPQExpBuffer(q, " START = %s,\n",
13521 convertTSFunction(fout, prsinfo->prsstart));
6) At lines 13522 to 13523, in function dumpTSParser().
13522 appendPQExpBuffer(q, " GETTOKEN = %s,\n",
13523 convertTSFunction(fout, prsinfo->prstoken));
7) At lines 13524 to 13525, in function dumpTSParser().
13524 appendPQExpBuffer(q, " END = %s,\n",
13525 convertTSFunction(fout, prsinfo->prsend));
8) At lines 13527 to 13528, in function dumpTSParser().
13527 appendPQExpBuffer(q, " HEADLINE = %s,\n",
13528 convertTSFunction(fout, prsinfo->prsheadline));
9) At lines 13529 to 13530, in function dumpTSParser().
13529 appendPQExpBuffer(q, " LEXTYPES = %s );\n",
13530 convertTSFunction(fout, prsinfo->prslextype));
10) At lines 13665 to13666, in function dumpTSTemplate().
13665 appendPQExpBuffer(q, " INIT = %s,\n",
13666 convertTSFunction(fout, tmplinfo->tmplinit));
11) At lines 13667 to 13668, in function dumpTSTemplate().
13667 appendPQExpBuffer(q, " LEXIZE = %s );\n",
13668 convertTSFunction(fout, tmplinfo->tmpllexize));
>"David G. Johnston" <david.g.johnston@gmail.com> writes:
>> On Fri, Feb 18, 2022 at 10:59 PM <wliang@stu.xidian.edu.cn> wrote:
>>> Specifically, at line 10545 and line 10546, function
>>> getFormattedTypeName() is called, which allocates a chunk of memory by
>>> using pg_strdup() and returns it.
>
>> I'm not a C programmer but am operating under the assumption that you are
>> probably incorrect. So I took a cursory look at the code (in HEAD),
>> starting with the function comment. It says:
>> "* Note that the result is cached and must not be freed by the caller."
>
>There's also this in the body of the function:
>
> /*
> * Cache the result for re-use in later requests, if possible. If we
> * don't have a TypeInfo for the type, the string will be leaked once the
> * caller is done with it ... but that case really should not happen, so
> * leaking if it does seems acceptable.
> */
>
>Since getTypes() makes a TypeInfo for every row of pg_type, "really
>should not happen" is an accurate statement. You'd pretty much have to
>be dealing with a catalog-corruption scenario for the non-cached path
>to be taken.
>
> regards, tom lane
12262 /*
12263 * Convert a function OID obtained from pg_ts_parser or pg_ts_template
12264 *
12265 * It is sufficient to use REGPROC rather than REGPROCEDURE, since the
12266 * argument lists of these functions are predetermined. Note that the
12267 * caller should ensure we are in the proper schema, because the results
12268 * are search path dependent!
12269 */
12270 static char *
12271 convertTSFunction(Archive *fout, Oid funcOid)
12272 {
12273 char *result;
12274 char query[128];
12275 PGresult *res;
12276
12277 snprintf(query, sizeof(query),
12278 "SELECT '%u'::pg_catalog.regproc", funcOid);
12279 res = ExecuteSqlQueryForSingleRow(fout, query);
12280
12281 result = pg_strdup(PQgetvalue(res, 0, 0));
12282
12283 PQclear(res);
12284
12285 return result;
12286 }
The following seven suspects are caused by convertTSFunction().
5) At lines 13520 to 13521, in function dumpTSParser().
13520 appendPQExpBuffer(q, " START = %s,\n",
13521 convertTSFunction(fout, prsinfo->prsstart));
6) At lines 13522 to 13523, in function dumpTSParser().
13522 appendPQExpBuffer(q, " GETTOKEN = %s,\n",
13523 convertTSFunction(fout, prsinfo->prstoken));
7) At lines 13524 to 13525, in function dumpTSParser().
13524 appendPQExpBuffer(q, " END = %s,\n",
13525 convertTSFunction(fout, prsinfo->prsend));
8) At lines 13527 to 13528, in function dumpTSParser().
13527 appendPQExpBuffer(q, " HEADLINE = %s,\n",
13528 convertTSFunction(fout, prsinfo->prsheadline));
9) At lines 13529 to 13530, in function dumpTSParser().
13529 appendPQExpBuffer(q, " LEXTYPES = %s );\n",
13530 convertTSFunction(fout, prsinfo->prslextype));
10) At lines 13665 to13666, in function dumpTSTemplate().
13665 appendPQExpBuffer(q, " INIT = %s,\n",
13666 convertTSFunction(fout, tmplinfo->tmplinit));
11) At lines 13667 to 13668, in function dumpTSTemplate().
13667 appendPQExpBuffer(q, " LEXIZE = %s );\n",
13668 convertTSFunction(fout, tmplinfo->tmpllexize));
>"David G. Johnston" <david.g.johnston@gmail.com> writes:
>> On Fri, Feb 18, 2022 at 10:59 PM <wliang@stu.xidian.edu.cn> wrote:
>>> Specifically, at line 10545 and line 10546, function
>>> getFormattedTypeName() is called, which allocates a chunk of memory by
>>> using pg_strdup() and returns it.
>
>> I'm not a C programmer but am operating under the assumption that you are
>> probably incorrect. So I took a cursory look at the code (in HEAD),
>> starting with the function comment. It says:
>> "* Note that the result is cached and must not be freed by the caller."
>
>There's also this in the body of the function:
>
> /*
> * Cache the result for re-use in later requests, if possible. If we
> * don't have a TypeInfo for the type, the string will be leaked once the
> * caller is done with it ... but that case really should not happen, so
> * leaking if it does seems acceptable.
> */
>
>Since getTypes() makes a TypeInfo for every row of pg_type, "really
>should not happen" is an accurate statement. You'd pretty much have to
>be dealing with a catalog-corruption scenario for the non-cached path
>to be taken.
>
> regards, tom lane
Hi, On Sat, Feb 19, 2022 at 04:04:38PM +0800, wliang@stu.xidian.edu.cn wrote: > However, there are also some potential leaks caused by convertTSFunction() > rather than getFormattedTypeName(). Besides, in convertTSFunction(), there > are no any comment to say that the memory should not be freed by the caller. Indeed this function doesn't cache anything and explicitly calls pg_strdup. Here the amount of leaked memory is likely to be very small (I never heard of people having thousands of text search templates or parsers), and pg_dump isn't a long lived process so probably no one thought it was worth to extra code to free that memory, which I agree with.
Re: Re: Re all: Report some potential memory leak bugs in pg_dump.c
From
wliang@stu.xidian.edu.cn
Date:
> Indeed this function doesn't cache anything and explicitly calls pg_strdup. > > Here the amount of leaked memory is likely to be very small (I never heard of > people having thousands of text search templates or parsers), and pg_dump isn't > a long lived process so probably no one thought it was worth to extra code to > free that memory, which I agree with. Thank you for your conformation. I am also grateful for your patient explanation to what I don't understand. Thank you all.
Julien Rouhaud <rjuju123@gmail.com> writes: > Here the amount of leaked memory is likely to be very small (I never heard of > people having thousands of text search templates or parsers), and pg_dump isn't > a long lived process so probably no one thought it was worth to extra code to > free that memory, which I agree with. Yeah. For context, $ valgrind pg_dump regression >/dev/null reports ==59330== HEAP SUMMARY: ==59330== in use at exit: 3,909,248 bytes in 70,831 blocks ==59330== total heap usage: 546,364 allocs, 475,533 frees, 34,377,280 bytes allocated ==59330== ==59330== LEAK SUMMARY: ==59330== definitely lost: 6 bytes in 6 blocks ==59330== indirectly lost: 0 bytes in 0 blocks ==59330== possibly lost: 0 bytes in 0 blocks ==59330== still reachable: 3,909,242 bytes in 70,825 blocks ==59330== suppressed: 0 bytes in 0 blocks So as long as you're willing to take the regression database as typical, pg_dump does not have any interesting leak problem. There are certainly things that this simple test doesn't reach --- publications/subscriptions are another area besides custom text search configurations --- but it seems unlikely that a database would have enough of any of those to justify much worry about whether those code paths leak anything. regards, tom lane