Thread: Re: Avoid possible overflow (src/port/bsearch_arg.c)
On Tue, Oct 08, 2024 at 04:09:00PM -0300, Ranier Vilela wrote: > The port function *bsearch_arg* mimics the C function > *bsearch*. > > The API signature is: > void * > bsearch_arg(const void *key, const void *base0, > size_t nmemb, size_t size, > int (*compar) (const void *, const void *, void *), > void *arg) > > So, the parameter *nmemb* is size_t. > Therefore, a call with nmemb greater than INT_MAX is possible. > > Internally the code uses the *int* type to iterate through the number of > members, which makes overflow possible. I traced this back to commit bfa2cee (v14), which both moved bsearch_arg() to its current location and adjusted the style a bit. Your patch looks reasonable to me. -- nathan
Em ter., 8 de out. de 2024 às 18:28, Nathan Bossart <nathandbossart@gmail.com> escreveu:
On Tue, Oct 08, 2024 at 04:09:00PM -0300, Ranier Vilela wrote:
> The port function *bsearch_arg* mimics the C function
> *bsearch*.
>
> The API signature is:
> void *
> bsearch_arg(const void *key, const void *base0,
> size_t nmemb, size_t size,
> int (*compar) (const void *, const void *, void *),
> void *arg)
>
> So, the parameter *nmemb* is size_t.
> Therefore, a call with nmemb greater than INT_MAX is possible.
>
> Internally the code uses the *int* type to iterate through the number of
> members, which makes overflow possible.
I traced this back to commit bfa2cee (v14), which both moved bsearch_arg()
to its current location and adjusted the style a bit. Your patch looks
reasonable to me.
Thanks for looking.
best regards,
Ranier Vilela
On 09/10/2024 19:16, Ranier Vilela wrote: > Em ter., 8 de out. de 2024 às 18:28, Nathan Bossart > <nathandbossart@gmail.com <mailto:nathandbossart@gmail.com>> escreveu: > > On Tue, Oct 08, 2024 at 04:09:00PM -0300, Ranier Vilela wrote: > > The port function *bsearch_arg* mimics the C function > > *bsearch*. > > > > The API signature is: > > void * > > bsearch_arg(const void *key, const void *base0, > > size_t nmemb, size_t size, > > int (*compar) (const void *, const void *, void *), > > void *arg) > > > > So, the parameter *nmemb* is size_t. > > Therefore, a call with nmemb greater than INT_MAX is possible. > > > > Internally the code uses the *int* type to iterate through the > number of > > members, which makes overflow possible. > > I traced this back to commit bfa2cee (v14), which both moved > bsearch_arg() > to its current location and adjusted the style a bit. Your patch looks > reasonable to me. > > Thanks for looking. Committed, thanks. Based on the original discussion on bfa2cee, I couldn't figure out where exactly this new bsearch implementation originated from, but googling around, probably *BSD or libiberty. Tomas, do you remember? Not that it matters, but I'm curious. Some of those other implementations have fixed this, others have not. And they all seem to also have the "involes" typo in the comment that we fixed in commit 7ef8b52cf07 :-). Ranier, you might want to submit this fix to those other projects too. -- Heikki Linnakangas Neon (https://neon.tech)
Em seg., 28 de out. de 2024 às 09:13, Heikki Linnakangas <hlinnaka@iki.fi> escreveu:
On 09/10/2024 19:16, Ranier Vilela wrote:
> Em ter., 8 de out. de 2024 às 18:28, Nathan Bossart
> <nathandbossart@gmail.com <mailto:nathandbossart@gmail.com>> escreveu:
>
> On Tue, Oct 08, 2024 at 04:09:00PM -0300, Ranier Vilela wrote:
> > The port function *bsearch_arg* mimics the C function
> > *bsearch*.
> >
> > The API signature is:
> > void *
> > bsearch_arg(const void *key, const void *base0,
> > size_t nmemb, size_t size,
> > int (*compar) (const void *, const void *, void *),
> > void *arg)
> >
> > So, the parameter *nmemb* is size_t.
> > Therefore, a call with nmemb greater than INT_MAX is possible.
> >
> > Internally the code uses the *int* type to iterate through the
> number of
> > members, which makes overflow possible.
>
> I traced this back to commit bfa2cee (v14), which both moved
> bsearch_arg()
> to its current location and adjusted the style a bit. Your patch looks
> reasonable to me.
>
> Thanks for looking.
Committed, thanks.
Thank you.
Based on the original discussion on bfa2cee, I couldn't figure out where
exactly this new bsearch implementation originated from, but googling
around, probably *BSD or libiberty. Tomas, do you remember? Not that it
matters, but I'm curious.
Some of those other implementations have fixed this, others have not.
And they all seem to also have the "involes" typo in the comment that we
fixed in commit 7ef8b52cf07 :-). Ranier, you might want to submit this
fix to those other projects too.
Sure, I can try.
best regards,
Ranier Vilela
On 10/28/24 13:13, Heikki Linnakangas wrote: > On 09/10/2024 19:16, Ranier Vilela wrote: >> Em ter., 8 de out. de 2024 às 18:28, Nathan Bossart >> <nathandbossart@gmail.com <mailto:nathandbossart@gmail.com>> escreveu: >> >> On Tue, Oct 08, 2024 at 04:09:00PM -0300, Ranier Vilela wrote: >> > The port function *bsearch_arg* mimics the C function >> > *bsearch*. >> > >> > The API signature is: >> > void * >> > bsearch_arg(const void *key, const void *base0, >> > size_t nmemb, size_t size, >> > int (*compar) (const void *, const void *, void *), >> > void *arg) >> > >> > So, the parameter *nmemb* is size_t. >> > Therefore, a call with nmemb greater than INT_MAX is possible. >> > >> > Internally the code uses the *int* type to iterate through the >> number of >> > members, which makes overflow possible. >> >> I traced this back to commit bfa2cee (v14), which both moved >> bsearch_arg() >> to its current location and adjusted the style a bit. Your patch >> looks >> reasonable to me. >> >> Thanks for looking. > > Committed, thanks. > > Based on the original discussion on bfa2cee, I couldn't figure out where > exactly this new bsearch implementation originated from, but googling > around, probably *BSD or libiberty. Tomas, do you remember? Not that it > matters, but I'm curious. > I don't remember, unfortunately :-( I think it was one of the *BSDs, because of license, but I'm not quite sure why I changed the code at all during the move. > Some of those other implementations have fixed this, others have not. > And they all seem to also have the "involes" typo in the comment that we > fixed in commit 7ef8b52cf07 :-). Ranier, you might want to submit this > fix to those other projects too. > Thanks for fixing this, although I wonder if we can actually hit this, as we don't really allocate more than 1GB in most places. But it's possible, and the pre-bfa2cee code handled it fine. regards -- Tomas Vondra
On 28/10/2024 17:30, Tomas Vondra wrote: >> Some of those other implementations have fixed this, others have not. >> And they all seem to also have the "involes" typo in the comment that we >> fixed in commit 7ef8b52cf07 :-). Ranier, you might want to submit this >> fix to those other projects too. > > Thanks for fixing this, although I wonder if we can actually hit this, > as we don't really allocate more than 1GB in most places. But it's > possible, and the pre-bfa2cee code handled it fine. Yeah, I didn't check closely I'm pretty sure none of the current callsites can pass anything near INT_MAX elements. While we're at it, there's this in dicts/spell.h: > /* > * Structure to store Hunspell options. Flag representation depends on flag > * type. These flags are about support of compound words. > */ > typedef struct CompoundAffixFlag > { > union > { > /* Flag name if flagMode is FM_CHAR or FM_LONG */ > const char *s; > /* Flag name if flagMode is FM_NUM */ > uint32 i; > } flag; > /* we don't have a bsearch_arg version, so, copy FlagMode */ > FlagMode flagMode; > uint32 value; > } CompoundAffixFlag; We have bsearch_arg() now, so we could switch to that and remove 'flagMode' from here. -- Heikki Linnakangas Neon (https://neon.tech)