Re: Bug: Reading from single byte character column type may cause out of bounds memory reads. - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.
Date
Msg-id CAJ7c6TNguAKGcYxk=SHGvGAGfdrx_2DP5bvtDNu+eQ+3kDbt7w@mail.gmail.com
In response to Bug: Reading from single byte character column type may cause out of bounds memory reads.  (Spyridon Dimitrios Agathos <spyridon.dimitrios.agathos@gmail.com>)
Responses Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.
List pgsql-hackers
Hi Spyridon,

> The column "single_byte_col" is supposed to store only 1 byte. Nevertheless, the INSERT command implicitly casts the
> '🀆' text into "char". This means that only the first byte of '🀆' ends up stored in the column.
> gdb reports that "pg_mblen(p) = 4" (line 1046), which is expected since the pg_mblen('🀆') is indeed 4. Later at line
> 1050, the memcpy will copy 4 bytes instead of 1, hence an out of bounds memory read happens for pointer 's', which
> effectively copies random bytes.
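
To make the mechanism above concrete, here is a minimal standalone sketch. It is not PostgreSQL code: `sketch_utf8_len_from_lead()` and `sketch_overread()` are hypothetical names, and the first one only imitates the general idea of deriving a UTF-8 character length from the lead byte alone, which is what makes a stored single byte such as 0xF0 look like the start of a 4-byte character.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a UTF-8 length routine: the length is
 * derived from the lead byte only, with no knowledge of how many
 * bytes are actually available behind the pointer.
 */
static int
sketch_utf8_len_from_lead(unsigned char lead)
{
    if ((lead & 0x80) == 0x00)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;                   /* invalid lead byte: treat as one byte */
}

/*
 * Number of bytes past the end of a buffer of 'avail' bytes that a copy
 * of the first character would read if it trusted the lead byte.
 */
static size_t
sketch_overread(const unsigned char *buf, size_t avail)
{
    assert(buf != NULL && avail > 0);

    size_t claimed = (size_t) sketch_utf8_len_from_lead(buf[0]);

    return claimed > avail ? claimed - avail : 0;
}
```

With a one-byte buffer holding only 0xF0, the claimed length is 4 while a single byte is available, so a memcpy of the claimed length would read 3 bytes beyond the stored value.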

Many thanks for reporting this!

> - OS: Ubuntu 20.04
> - PSQL version 14.4

I can confirm that the bug exists in the `master` branch as well and
that it doesn't depend on the platform.

Although the bug is easy to fix for this particular case (see the
patch), I'm not sure this solution is general enough. E.g., is there
anything that generally prevents pg_mblen() from reading out of
bounds in cases similar to this one? Should we prevent such an INSERT
from happening instead?
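
As an illustration of what a more general guard could look like, here is a rough sketch that clamps the length claimed for a character to the number of bytes known to be available before copying. This is only an assumption about one possible shape of such a check, not the attached patch and not an existing PostgreSQL API; `sketch_clamped_len()` and `sketch_copy_char_bounded()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: never trust a claimed length beyond 'avail'. */
static size_t
sketch_clamped_len(size_t claimed, size_t avail)
{
    return claimed < avail ? claimed : avail;
}

/*
 * Copy one character from 'src' to 'dst'. 'claimed' is the length that
 * an encoding routine reported for the character, 'avail' is the number
 * of bytes really stored at 'src'. Returns the number of bytes copied.
 */
static size_t
sketch_copy_char_bounded(char *dst, const char *src,
                         size_t avail, size_t claimed)
{
    assert(dst != NULL && src != NULL);

    size_t n = sketch_clamped_len(claimed, avail);

    memcpy(dst, src, n);        /* reads at most 'avail' bytes of src */
    return n;
}
```

The caller has to know `avail`, which is the crux of the question above: a routine that only receives a pointer cannot perform this check on its own, so the bound must come either from the caller or from rejecting the truncated value at INSERT time.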

--
Best regards,
Aleksander Alekseev

Attachment
