Thread: Plug minor memleak in pg_dump
Hi,
I noticed a minor memleak in pg_dump. ReadStr() returns a malloc'ed pointer which
should then be freed. While reading the Table of Contents, it was called as an argument
within a function call, leading to a memleak.
Please accept the attached as a proposed fix.
Cheers,
//Georgios
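For readers without the attachment, here is a minimal sketch of the pattern described above. It is not the attached patch: `read_str_stub()`, `free_str_stub()` and the `live_strings` counter are hypothetical stand-ins for `ReadStr()` and its caller-owned result, used only to make the lost pointer visible.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Illustration only: number of strings handed out and not yet freed. */
int			live_strings = 0;

/* Hypothetical stand-in for ReadStr(): returns a malloc'ed copy the caller owns. */
char *
read_str_stub(const char *src)
{
	size_t		len = strlen(src) + 1;
	char	   *copy = malloc(len);

	if (copy == NULL)
		abort();
	memcpy(copy, src, len);
	live_strings++;
	return copy;
}

/* Hypothetical helper: frees a string and updates the counter. */
void
free_str_stub(char *s)
{
	free(s);
	live_strings--;
}

/* Leaky: the returned pointer is only ever seen by strcmp() and is then lost. */
bool
flag_is_true_leaky(const char *src)
{
	return strcmp(read_str_stub(src), "true") == 0;
}

/* Fixed: keep the pointer in a temporary so it can be released afterwards. */
bool
flag_is_true_fixed(const char *src)
{
	char	   *tmp = read_str_stub(src);
	bool		result = (strcmp(tmp, "true") == 0);

	free_str_stub(tmp);
	return result;
}
```

Both functions return the same answer; only the second leaves `live_strings` unchanged after the call.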
On Tue, Feb 1, 2022 at 7:06 PM <gkokolatos@pm.me> wrote:
>
> Hi,
>
> I noticed a minor memleak in pg_dump. ReadStr() returns a malloc'ed pointer which
> should then be freed. While reading the Table of Contents, it was called as an argument
> within a function call, leading to a memleak.
>
> Please accept the attached as a proposed fix.

+1. IMO, having "restoring tables WITH OIDS is not supported anymore"
twice doesn't look good, how about as shown in [1]?

[1]
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 49bf0907cd..777ff6fcfe 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2494,6 +2494,7 @@ ReadToc(ArchiveHandle *AH)
 	int			depIdx;
 	int			depSize;
 	TocEntry   *te;
+	bool		is_supported = true;
 	AH->tocCount = ReadInt(AH);
 	AH->maxDumpId = 0;
@@ -2574,7 +2575,20 @@ ReadToc(ArchiveHandle *AH)
 			te->tableam = ReadStr(AH);
 		te->owner = ReadStr(AH);
-		if (AH->version < K_VERS_1_9 || strcmp(ReadStr(AH), "true") == 0)
+
+		if (AH->version < K_VERS_1_9)
+			is_supported = false;
+		else
+		{
+			tmp = ReadStr(AH);
+
+			if (strcmp(tmp, "true") == 0)
+				is_supported = false;
+
+			pg_free(tmp);
+		}
+
+		if (!is_supported)
 			pg_log_warning("restoring tables WITH OIDS is not supported anymore");
 		/* Read TOC entry dependencies */

Regards,
Bharath Rupireddy.
At Tue, 1 Feb 2022 19:48:01 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> On Tue, Feb 1, 2022 at 7:06 PM <gkokolatos@pm.me> wrote:
> >
> > Hi,
> >
> > I noticed a minor memleak in pg_dump. ReadStr() returns a malloc'ed pointer which
> > should then be freed. While reading the Table of Contents, it was called as an argument
> > within a function call, leading to a memleak.
> >
> > Please accept the attached as a proposed fix.

It is freed in the other temporary uses of the result of ReadStr(). So
freeing it sounds sensible at a glance.

> +1. IMO, having "restoring tables WITH OIDS is not supported anymore"
> twice doesn't look good, how about as shown in [1]?

Maybe [2] is smaller :)

--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2494,6 +2494,7 @@ ReadToc(ArchiveHandle *AH)
 	int			depIdx;
 	int			depSize;
 	TocEntry   *te;
+	char	   *tmpstr = NULL;
 	AH->tocCount = ReadInt(AH);
 	AH->maxDumpId = 0;
@@ -2574,8 +2575,14 @@ ReadToc(ArchiveHandle *AH)
 			te->tableam = ReadStr(AH);
 		te->owner = ReadStr(AH);
-		if (AH->version < K_VERS_1_9 || strcmp(ReadStr(AH), "true") == 0)
+		if (AH->version < K_VERS_1_9 ||
+			strcmp((tmpstr = ReadStr(AH)), "true") == 0)
 			pg_log_warning("restoring tables WITH OIDS is not supported anymore");
+		if (tmpstr)
+		{
+			pg_free(tmpstr);
+			tmpstr = NULL;
+		}
 		/* Read TOC entry dependencies */

Thus, I have come to doubt that it is worth the complexity. The
amount of the leak is (perhaps) negligible.

So, I would just write a comment there.

+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2574,6 +2574,8 @@ ReadToc(ArchiveHandle *AH)
 			te->tableam = ReadStr(AH);
 		te->owner = ReadStr(AH);
+
+		/* deliberately leak the result of ReadStr for simplicity */
 		if (AH->version < K_VERS_1_9 || strcmp(ReadStr(AH), "true") == 0)
 			pg_log_warning("restoring tables WITH OIDS is not supported anymore");

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
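The assignment-inside-the-condition variant relies on short-circuit evaluation: when the version check on the left is already true, the read on the right never runs, so the temporary must start out as NULL. A rough sketch of that behaviour follows, with hypothetical names (`read_flag_stub()`, `warning_needed()`, `reads_done`) standing in for the real code. It uses plain `free()`, for which passing NULL is defined to be a no-op, so this sketch needs no explicit guard before freeing.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Illustration only: counts how often the stub reader is invoked. */
int			reads_done = 0;

/* Hypothetical stand-in for ReadStr(): returns a malloc'ed copy of src. */
char *
read_flag_stub(const char *src)
{
	size_t		len = strlen(src) + 1;
	char	   *copy = malloc(len);

	if (copy == NULL)
		abort();
	memcpy(copy, src, len);
	reads_done++;
	return copy;
}

/*
 * Returns true when the warning should be emitted.  If the version test
 * is true, the right-hand side of || is skipped and tmpstr stays NULL.
 */
bool
warning_needed(int version, int min_version, const char *src)
{
	char	   *tmpstr = NULL;
	bool		warn;

	warn = (version < min_version ||
			strcmp((tmpstr = read_flag_stub(src)), "true") == 0);

	free(tmpstr);				/* free(NULL) does nothing */
	return warn;
}
```

Calling `warning_needed(8, 9, "false")` returns true without touching the reader, which is why the initialisation to NULL matters.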
> On 2 Feb 2022, at 09:29, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Tue, 1 Feb 2022 19:48:01 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
>> On Tue, Feb 1, 2022 at 7:06 PM <gkokolatos@pm.me> wrote:
>>>
>>> Hi,
>>>
>>> I noticed a minor memleak in pg_dump. ReadStr() returns a malloc'ed pointer which
>>> should then be freed. While reading the Table of Contents, it was called as an argument
>>> within a function call, leading to a memleak.
>>>
>>> Please accept the attached as a proposed fix.
>
> It is freed in the other temporary uses of the result of ReadStr(). So
> freeing it sounds sensible at a glance.
>
>> +1. IMO, having "restoring tables WITH OIDS is not supported anymore"
>> twice doesn't look good, how about as shown in [1]?
>
> Maybe [2] is smaller :)

It is smaller, but I think Bharath's version wins in terms of readability.

> Thus, I have come to doubt that it is worth the complexity. The
> amount of the leak is (perhaps) negligible.
>
> So, I would just write a comment there.

The leak itself is clearly not something to worry about wrt memory pressure.
We do read into tmp and free it in other places in the same function though (as
you note above), so for code consistency alone this is worth doing IMO (and it
reduces the risk of static analyzers flagging this).

Unless objected to I will go ahead with getting this committed.

--
Daniel Gustafsson		https://vmware.com/
On Wed, Feb 02, 2022 at 10:06:13AM +0100, Daniel Gustafsson wrote:
> The leak itself is clearly not something to worry about wrt memory pressure.
> We do read into tmp and free it in other places in the same function though (as
> you note above), so for code consistency alone this is worth doing IMO (and it
> reduces the risk of static analyzers flagging this).
>
> Unless objected to I will go ahead with getting this committed.

Looks like you forgot to apply that?
--
Michael
On Wed, Feb 9, 2022 at 8:26 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Feb 02, 2022 at 10:06:13AM +0100, Daniel Gustafsson wrote:
> > The leak itself is clearly not something to worry about wrt memory pressure.
> > We do read into tmp and free it in other places in the same function though (as
> > you note above), so for code consistency alone this is worth doing IMO (and it
> > reduces the risk of static analyzers flagging this).
> >
> > Unless objected to I will go ahead with getting this committed.
>
> Looks like you forgot to apply that?

Attaching the patch that I suggested above, also the original patch
proposed by Georgios is at [1], leaving the decision to the committer
to pick up the best one.

[1] https://www.postgresql.org/message-id/oZwKiUxFsVaetG2xOJp7Hwao8F1AKIdfFDQLNJrnwoaxmjyB-45r_aYmhgXHKLcMI3GT24m9L6HafSi2ns7WFxXe0mw2_tIJpD-Z3vb_eyI%3D%40pm.me

Regards,
Bharath Rupireddy.
> On 9 Feb 2022, at 03:56, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Feb 02, 2022 at 10:06:13AM +0100, Daniel Gustafsson wrote:
>> The leak itself is clearly not something to worry about wrt memory pressure.
>> We do read into tmp and free it in other places in the same function though (as
>> you note above), so for code consistency alone this is worth doing IMO (and it
>> reduces the risk of static analyzers flagging this).
>>
>> Unless objected to I will go ahead with getting this committed.
>
> Looks like you forgot to apply that?

No, but I was distracted by other things leaving this on the TODO list. It's
been pushed now.

--
Daniel Gustafsson		https://vmware.com/
>No, but I was distracted by other things leaving this on the TODO list. It's
>been pushed now.
Hi,
IMO there is still a problem here.
ReadStr can return NULL, so the fix can crash.
regards,
Ranier Vilela
On Wed, Feb 09, 2022 at 02:48:35PM -0300, Ranier Vilela wrote:
> IMO I think that still have troubles here.
>
> ReadStr can return NULL, so the fix can crash.

-	sscanf(tmp, "%u", &te->catalogId.tableoid);
-	free(tmp);
+	if (tmp)
+	{
+		sscanf(tmp, "%u", &te->catalogId.tableoid);
+		free(tmp);
+	}
+	else
+		te->catalogId.tableoid = InvalidOid;

This patch makes things worse, doesn't it?  Doesn't this localized
change mean that we expose ourselves more into *ignoring* TOC entries
if we mess up with this code in the future?  That sounds particularly
sensible if you have a couple of bytes corrupted in a dump.
--
Michael
On Wed, Feb 9, 2022 at 23:16, Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Feb 09, 2022 at 02:48:35PM -0300, Ranier Vilela wrote:
> > IMO I think that still have troubles here.
> >
> > ReadStr can return NULL, so the fix can crash.
>
> -	sscanf(tmp, "%u", &te->catalogId.tableoid);
> -	free(tmp);
> +	if (tmp)
> +	{
> +		sscanf(tmp, "%u", &te->catalogId.tableoid);
> +		free(tmp);
> +	}
> +	else
> +		te->catalogId.tableoid = InvalidOid;
>
> This patch makes things worse, doesn't it?

No.

> Doesn't this localized
> change mean that we expose ourselves more into *ignoring* TOC entries
> if we mess up with this code in the future?

InvalidOid is already used for the "default" case.
If ReadStr fails and returns NULL, sscanf will crash.
Maybe in this case it would be better to report it to the user,
with pg_log_warning?

regards,
Ranier Vilela
On Thu, Feb 10, 2022 at 08:14, Ranier Vilela <ranier.vf@gmail.com> wrote:
> On Wed, Feb 9, 2022 at 23:16, Michael Paquier <michael@paquier.xyz> wrote:
> > On Wed, Feb 09, 2022 at 02:48:35PM -0300, Ranier Vilela wrote:
> > > IMO I think that still have troubles here.
> > >
> > > ReadStr can return NULL, so the fix can crash.
> >
> > -	sscanf(tmp, "%u", &te->catalogId.tableoid);
> > -	free(tmp);
> > +	if (tmp)
> > +	{
> > +		sscanf(tmp, "%u", &te->catalogId.tableoid);
> > +		free(tmp);
> > +	}
> > +	else
> > +		te->catalogId.tableoid = InvalidOid;
> >
> > This patch makes things worse, doesn't it?
>
> No.
>
> > Doesn't this localized
> > change mean that we expose ourselves more into *ignoring* TOC entries
> > if we mess up with this code in the future?
>
> InvalidOid already used for "default".
> If ReadStr fails and returns NULL, sscanf will crash.
> Maybe in this case, better report to the user?
> pg_log_warning?

Maybe in this case the right thing is to abort?
See v2, please.
regards,
Ranier Vilela
> On 10 Feb 2022, at 12:14, Ranier Vilela <ranier.vf@gmail.com> wrote:
> On Wed, Feb 9, 2022 at 23:16, Michael Paquier <michael@paquier.xyz> wrote:

> This patch makes things worse, doesn't it?
> No.
>
> Doesn't this localized
> change mean that we expose ourselves more into *ignoring* TOC entries
> if we mess up with this code in the future?
> InvalidOid already used for "default".

There is no default case here, setting the tableoid to InvalidOid is done when
the archive doesn't support this particular feature. If we can't read the
tableoid here, it's a corrupt TOC and we should abort.

> If ReadStr fails and returns NULL, sscanf will crash.

Yes, which is better than silently propagating the error.

> Maybe in this case, better report to the user?
> pg_log_warning?

That would demote what is today a crash to a warning on a corrupt TOC entry,
which I think is the wrong way to go. Question is, can this fail in a
non-synthetic case on output which was successfully generated by pg_dump? I'm
not saying we should ignore errors, but I have a feeling that any input fed
that triggers this will be broken enough to cause fireworks elsewhere too, and
this being a chase towards low returns apart from complicating the code.

--
Daniel Gustafsson		https://vmware.com/
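To make the two positions concrete, here is a small sketch contrasting a lenient parser, which turns a missing string into the invalid OID, with a strict one that reports failure so the caller can stop. This is not code from either posted patch: `parse_oid_lenient()`, `parse_oid_strict()` and `INVALID_OID_STUB` are hypothetical names, and a NULL input models the "ReadStr can return NULL" case raised above.

```c
#include <stdbool.h>
#include <stdio.h>

#define INVALID_OID_STUB 0u		/* hypothetical stand-in for InvalidOid */

/*
 * Lenient: a missing string silently becomes the invalid OID, so a
 * corrupt entry looks the same as one that legitimately carries 0.
 */
unsigned int
parse_oid_lenient(const char *s)
{
	unsigned int oid = INVALID_OID_STUB;

	if (s != NULL)
		sscanf(s, "%u", &oid);
	return oid;
}

/*
 * Strict: returns false for a missing or unparseable string, leaving it
 * to the caller to treat that as corruption and stop reading.
 */
bool
parse_oid_strict(const char *s, unsigned int *oid)
{
	if (s == NULL)
		return false;
	return sscanf(s, "%u", oid) == 1;
}
```

With the lenient version, `parse_oid_lenient(NULL)` and `parse_oid_lenient("0")` cannot be told apart by the caller, which is the "silently ignoring" concern; the strict version keeps the failure visible without relying on a crash inside `sscanf()`.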
On Thu, Feb 10, 2022 at 10:57, Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 10 Feb 2022, at 12:14, Ranier Vilela <ranier.vf@gmail.com> wrote:
> > On Wed, Feb 9, 2022 at 23:16, Michael Paquier <michael@paquier.xyz> wrote:
> > This patch makes things worse, doesn't it?
> > No.
> >
> > Doesn't this localized
> > change mean that we expose ourselves more into *ignoring* TOC entries
> > if we mess up with this code in the future?
> > InvalidOid already used for "default".
>
> There is no default case here, setting the tableoid to InvalidOid is done when
> the archive doesn't support this particular feature. If we can't read the
> tableoid here, it's a corrupt TOC and we should abort.

Well, the v2 aborts.

> > If ReadStr fails and returns NULL, sscanf will crash.
>
> Yes, which is better than silently propagating the error.

OK, silently propagating the error is bad, but crashing is a sign of a poor tool.

> > Maybe in this case, better report to the user?
> > pg_log_warning?
>
> That would demote what is today a crash to a warning on a corrupt TOC entry,
> which I think is the wrong way to go. Question is, can this fail in a
> non-synthetic case on output which was successfully generated by pg_dump? I'm
> not saying we should ignore errors, but I have a feeling that any input fed
> that triggers this will be broken enough to cause fireworks elsewhere too, and
> this being a chase towards low returns apart from complicating the code.

For me, the code stays simpler and more maintainable.

regards,
Ranier Vilela