Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c) - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c)
Date
Msg-id 20220217.155126.649675036261400698.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c)  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c)  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c)  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
At Thu, 17 Feb 2022 14:58:38 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> (Sorry for the broken mail...)
> 
> At Wed, 16 Feb 2022 09:29:20 -0300, Ranier Vilela <ranier.vf@gmail.com> wrote in 
> > > The patch:
> > >  pg_encoding_max_length(int encoding)
> > >  {
> > > -       Assert(PG_VALID_ENCODING(encoding));
> > > -
> > > -       return pg_wchar_table[encoding].maxmblen;
> > > +       if (PG_VALID_ENCODING(encoding))
> > > +               return pg_wchar_table[encoding].maxmblen;
> > > +       else
> > > +               return -1;
> > >
> > > Returning -1 for invalid encoding is, I think, flat wrong.
> > >
> > Ok, if -1 is wrong, what should the value of return if
> > somebody calling this function like:
> > pg_encoding_max_length(63);
> 
> Should result in assertion failure, I think.  If that fails, the
> caller side is anyhow broken.  On the other hand we haven't had a
> complain about that, maybe.
> 
> > Of course, with patch applied, because with original code
> > has memory corruption, if built and run without DEBUG.
> 
> So we don't assume corruption in production build.  It should be
> logically guaranteed.
> 
> I'll dig into that further.

The number comes from pg_char_to_encoding, which is the internal
function of the SQL function PG_char_encoding(). It looks fine but I
can confirm that for all possible encoding names in pg_encname_tbl[].

=# select * from (
select e as name, PG_char_to_encoding(e) enc
from unnest(array['abc', 'alt', 'big5', 'euccn', 'eucjis2004', 'eucjp',
    'euckr', 'euctw', 'gb18030', 'gbk', 'iso88591', 'iso885910', 'iso885913',
    'iso885914', 'iso885915', 'iso885916', 'iso88592', 'iso88593', 'iso88594',
    'iso88595', 'iso88596', 'iso88597', 'iso88598', 'iso88599', 'johab',
    'koi8', 'koi8r', 'koi8u', 'latin1', 'latin10', 'latin2', 'latin3',
    'latin4', 'latin5', 'latin6', 'latin7', 'latin8', 'latin9', 'mskanji',
    'muleinternal', 'shiftjis', 'shiftjis2004', 'sjis', 'sqlascii', 'tcvn',
    'tcvn5712', 'uhc', 'unicode', 'utf8', 'vscii', 'win', 'win1250', 'win1251',
    'win1252', 'win1253', 'win1254', 'win1255', 'win1256', 'win1257',
    'win1258', 'win866', 'win874', 'win932', 'win936', 'win949', 'win950',
    'windows1250', 'windows1251', 'windows1252', 'windows1253', 'windows1254',
    'windows1255', 'windows1256', 'windows1257', 'windows1258', 'windows866',
    'windows874', 'windows932', 'windows936', 'windows949', 'windows950',
    'hoge']) as e) as t where enc < 0 or enc > 41;
 name | enc 
------+-----
 hoge |  -1
(1 row)

So, the function doesn't return 63 for all registered names and wrong
names.

So other possibilities I can think of are..

- Someone had broken pg_encname_tbl[]

- Cosmic ray hit, or ill memory cell.

- Coverity worked wrong way.


Could you show the workload for the Coverity warning here?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: row filtering for logical replication
Next
From: Andres Freund
Date:
Subject: Re: Time to drop plpython2?