Bug: Reading from single byte character column type may cause out of bounds memory reads. - Mailing list pgsql-hackers

From Spyridon Dimitrios Agathos
Subject Bug: Reading from single byte character column type may cause out of bounds memory reads.
Date
Msg-id CAFM5RapGbBQm+dH=7K80HcvBvEWiV5Tm7N=NRaYURfm98YWc8A@mail.gmail.com
Whole thread Raw
Responses Re: Bug: Reading from single byte character column type may cause out of bounds memory reads.
List pgsql-hackers
Hi hackers,

While I was writing a test for PSQL, I faced a weird scenario. Depending on how I build PSQL (enabling or not debug options), I saw different results for the following query.

Steps to reproduce:
- OS: Ubuntu 20.04
- PSQL version 14.4

CREATE TABLE test (single_byte_col "char");
INSERT INTO test (single_byte_col) VALUES ('🀆');

In case where the following query runs in debug mode, I got this result:
SELECT left(single_byte_col, 1) l FROM test;
      l
-----------
 ~\x7F\x7F
(1 row)

Once I turn off debug mode, I got:
SELECT left(single_byte_col, 1) l FROM test;
 l
---

(1 row)

That triggered me to use Valgrind, which reported the following error:

==00:00:03:22.867 1171== VALGRINDERROR-BEGIN
==00:00:03:22.867 1171== Invalid read of size 1
==00:00:03:22.867 1171== at 0x4C41209: memmove (vg_replace_strmem.c:1382)
==00:00:03:22.868 1171== by 0xC8D8D0: text_substring (../src/backend/utils/adt/varlena.c:1050)
==00:00:03:22.868 1171== by 0xC9AC78: text_left (../src/backend/utils/adt/varlena.c:5614)
==00:00:03:22.868 1171== by 0x7FBB75: ExecInterpExpr (../src/backend/executor/execExprInterp.c:749)
==00:00:03:22.868 1171== by 0x7FADA6: ExecInterpExprStillValid (../src/backend/executor/execExprInterp.c:1824)
==00:00:03:22.868 1171== by 0x81C0AA: ExecEvalExprSwitchContext (../src/include/executor/executor.h:339)
==00:00:03:22.868 1171== by 0x81BD96: ExecProject (../src/include/executor/executor.h:373)
==00:00:03:22.868 1171== by 0x81B8BE: ExecScan (../src/backend/executor/execScan.c:238)
==00:00:03:22.868 1171== by 0x8638BA: ExecSeqScan (../src/backend/executor/nodeSeqscan.c:112)
==00:00:03:22.868 1171== by 0x8170D4: ExecProcNodeFirst (../src/backend/executor/execProcnode.c:465)
==00:00:03:22.868 1171== by 0x80F291: ExecProcNode (../src/include/executor/executor.h:257)
==00:00:03:22.868 1171== by 0x80A8F0: ExecutePlan (../src/backend/executor/execMain.c:1551)
==00:00:03:22.868 1171== by 0x80A7A3: standard_ExecutorRun (../src/backend/executor/execMain.c:361)
==00:00:03:22.868 1171== by 0x136231BC: pgss_ExecutorRun (../contrib/pg_stat_statements/pg_stat_statements.c:1001)
==00:00:03:22.868 1171== by 0x80A506: ExecutorRun (../src/backend/executor/execMain.c:303)
==00:00:03:22.868 1171== by 0xAD71B0: PortalRunSelect (../src/backend/tcop/pquery.c:921)
==00:00:03:22.868 1171== by 0xAD6B75: PortalRun (../src/backend/tcop/pquery.c:765)
==00:00:03:22.868 1171== by 0xAD1B3C: exec_simple_query (../src/backend/tcop/postgres.c:1214)
==00:00:03:22.868 1171== by 0xAD0D7F: PostgresMain (../src/backend/tcop/postgres.c:4496)
==00:00:03:22.868 1171== by 0x9D6C79: BackendRun (../src/backend/postmaster/postmaster.c:4530)
==00:00:03:22.868 1171== by 0x9D61A3: BackendStartup (../src/backend/postmaster/postmaster.c:4252)
==00:00:03:22.868 1171== by 0x9D4F56: ServerLoop (../src/backend/postmaster/postmaster.c:1745)
==00:00:03:22.868 1171== by 0x9D23A4: PostmasterMain (../src/backend/postmaster/postmaster.c:1417)
==00:00:03:22.868 1171== by 0x8AA0D2: main (../src/backend/main/main.c:209)
==00:00:03:22.868 1171== Address 0x3ed1f3c5 is 293 bytes inside a block of size 8,192 alloc'd
==00:00:03:22.868 1171== at 0x4C37135: malloc (vg_replace_malloc.c:381)
==00:00:03:22.868 1171== by 0xD1C3C0: AllocSetContextCreateInternal (../src/backend/utils/mmgr/aset.c:469)
==00:00:03:22.868 1171== by 0x821CB4: CreateExprContextInternal (../src/backend/executor/execUtils.c:253)
==00:00:03:22.868 1171== by 0x821BE2: CreateExprContext (../src/backend/executor/execUtils.c:303)
==00:00:03:22.868 1171== by 0x822078: ExecAssignExprContext (../src/backend/executor/execUtils.c:482)
==00:00:03:22.868 1171== by 0x8637D6: ExecInitSeqScan (../src/backend/executor/nodeSeqscan.c:147)
==00:00:03:22.868 1171== by 0x816B2D: ExecInitNode (../src/backend/executor/execProcnode.c:211)
==00:00:03:22.868 1171== by 0x80A336: InitPlan (../src/backend/executor/execMain.c:936)
==00:00:03:22.868 1171== by 0x809BE7: standard_ExecutorStart (../src/backend/executor/execMain.c:263)
==00:00:03:22.868 1171== by 0x1362303C: pgss_ExecutorStart (../contrib/pg_stat_statements/pg_stat_statements.c:963)
==00:00:03:22.868 1171== by 0x809881: ExecutorStart (../src/backend/executor/execMain.c:141)
==00:00:03:22.868 1171== by 0xAD636A: PortalStart (../src/backend/tcop/pquery.c:514)
==00:00:03:22.868 1171== by 0xAD1A31: exec_simple_query (../src/backend/tcop/postgres.c:1175)
==00:00:03:22.868 1171== by 0xAD0D7F: PostgresMain (../src/backend/tcop/postgres.c:4496)
==00:00:03:22.868 1171== by 0x9D6C79: BackendRun (../src/backend/postmaster/postmaster.c:4530)
==00:00:03:22.868 1171== by 0x9D61A3: BackendStartup (../src/backend/postmaster/postmaster.c:4252)
==00:00:03:22.868 1171== by 0x9D4F56: ServerLoop (../src/backend/postmaster/postmaster.c:1745)
==00:00:03:22.868 1171== by 0x9D23A4: PostmasterMain (../src/backend/postmaster/postmaster.c:1417)
==00:00:03:22.868 1171== by 0x8AA0D2: main (../src/backend/main/main.c:209)
==00:00:03:22.868 1171==
==00:00:03:22.868 1171== VALGRINDERROR-END

Then I attached a debugger to inspect the variables involved in the error reported by Valgrind for file /src/backend/utils/adt/varlena.c:

1045 for (i = S1; i < E1; i++)
1046 p += pg_mblen(p);
1047
1048 ret = (text *) palloc(VARHDRSZ + (p - s));
1049 SET_VARSIZE(ret, VARHDRSZ + (p - s));
1050 memcpy(VARDATA(ret), s, (p - s));

The column "single_byte_col" is supposed to store only 1 byte. Nevertheless, the INSERT command implicitly casts the '🀆' text into "char". This means that only the first byte of '🀆' ends up stored in the column.
gdb reports that "pg_mblen(p) = 4" (line 1046), which is expected since the pg_mblen('🀆') is indeed 4. Later at line 1050, the memcpy will copy 4 bytes instead of 1, hence an out of bounds memory read happens for pointer 's', which effectively copies random bytes.

--
Spiros
(ServiceNow)

pgsql-hackers by date:

Previous
From: Dagfinn Ilmari Mannsåker
Date:
Subject: Re: Building PostgreSQL in external directory is broken?
Next
From: Aleksander Alekseev
Date:
Subject: Re: Building PostgreSQL in external directory is broken?