Thread: BUG #16527: Valgrind detects an invalid read in brin_revmap_data with non-index page

From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      16527
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 13beta2
Operating system:   Ubuntu 20.04
Description:

Running the following query (based on contrib/pageinspect/sql/brin.sql)
under valgrind:
CREATE EXTENSION pageinspect;
CREATE TABLE test1 (a int, b text);
INSERT INTO test1 VALUES (1, 'one');
SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));

leads to a memory access error:
==00:00:00:12.518 934833== Invalid read of size 2
==00:00:00:12.518 934833==    at 0x4865AE1: verify_brin_page (brinfuncs.c:107)
==00:00:00:12.518 934833==    by 0x486674E: brin_revmap_data (brinfuncs.c:386)
==00:00:00:12.518 934833==    by 0x3C9656: ExecMakeTableFunctionResult (execSRF.c:234)
==00:00:00:12.518 934833==    by 0x3DB7D4: FunctionNext (nodeFunctionscan.c:95)
==00:00:00:12.518 934833==    by 0x3CA059: ExecScanFetch (execScan.c:133)
==00:00:00:12.518 934833==    by 0x3CA0F4: ExecScan (execScan.c:182)
==00:00:00:12.518 934833==    by 0x3DB6DF: ExecFunctionScan (nodeFunctionscan.c:270)
==00:00:00:12.518 934833==    by 0x3C70B2: ExecProcNodeFirst (execProcnode.c:450)
==00:00:00:12.518 934833==    by 0x3BFDD3: ExecProcNode (executor.h:245)
==00:00:00:12.518 934833==    by 0x3BFDD3: ExecutePlan (execMain.c:1646)
==00:00:00:12.518 934833==    by 0x3BFFB3: standard_ExecutorRun (execMain.c:364)
==00:00:00:12.518 934833==    by 0x3C007F: ExecutorRun (execMain.c:308)
==00:00:00:12.518 934833==    by 0x55F21F: PortalRunSelect (pquery.c:912)
==00:00:00:12.518 934833==  Address 0xe69cc0a is 2 bytes after a block of size 8,264 alloc'd
==00:00:00:12.518 934833==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:00:12.518 934833==    by 0x6A94CE: AllocSetAlloc (aset.c:739)
==00:00:00:12.518 934833==    by 0x6B2AA7: palloc (mcxt.c:963)
==00:00:00:12.518 934833==    by 0x486B838: get_raw_page_internal (rawpage.c:154)
==00:00:00:12.518 934833==    by 0x486BC35: get_raw_page (rawpage.c:62)
==00:00:00:12.518 934833==    by 0x3BACBE: ExecInterpExpr (execExprInterp.c:699)
==00:00:00:12.518 934833==    by 0x3B7A64: ExecInterpExprStillValid (execExprInterp.c:1802)
==00:00:00:12.518 934833==    by 0x3C8C3B: ExecEvalExpr (executor.h:294)
==00:00:00:12.518 934833==    by 0x3C8C3B: ExecEvalFuncArgs (execSRF.c:836)
==00:00:00:12.518 934833==    by 0x3C95C8: ExecMakeTableFunctionResult (execSRF.c:181)
==00:00:00:12.518 934833==    by 0x3DB7D4: FunctionNext (nodeFunctionscan.c:95)
==00:00:00:12.518 934833==    by 0x3CA059: ExecScanFetch (execScan.c:133)
==00:00:00:12.518 934833==    by 0x3CA0F4: ExecScan (execScan.c:182)
==00:00:00:12.518 934833==
{
   <insert_a_suppression_name_here>
   Memcheck:Addr2
   fun:verify_brin_page
   fun:brin_revmap_data
   fun:ExecMakeTableFunctionResult
   fun:FunctionNext
   fun:ExecScanFetch
   fun:ExecScan
   fun:ExecFunctionScan
   fun:ExecProcNodeFirst
   fun:ExecProcNode
   fun:ExecutePlan
   fun:standard_ExecutorRun
   fun:ExecutorRun
   fun:PortalRunSelect
}
==00:00:00:12.519 934833== Invalid read of size 2
==00:00:00:12.519 934833==    at 0x4865C07: verify_brin_page (brinfuncs.c:108)
==00:00:00:12.519 934833==    by 0x486674E: brin_revmap_data (brinfuncs.c:386)
==00:00:00:12.519 934833==    by 0x3C9656: ExecMakeTableFunctionResult (execSRF.c:234)
==00:00:00:12.519 934833==    by 0x3DB7D4: FunctionNext (nodeFunctionscan.c:95)
==00:00:00:12.519 934833==    by 0x3CA059: ExecScanFetch (execScan.c:133)
==00:00:00:12.519 934833==    by 0x3CA0F4: ExecScan (execScan.c:182)
==00:00:00:12.519 934833==    by 0x3DB6DF: ExecFunctionScan (nodeFunctionscan.c:270)
==00:00:00:12.519 934833==    by 0x3C70B2: ExecProcNodeFirst (execProcnode.c:450)
==00:00:00:12.519 934833==    by 0x3BFDD3: ExecProcNode (executor.h:245)
==00:00:00:12.519 934833==    by 0x3BFDD3: ExecutePlan (execMain.c:1646)
==00:00:00:12.519 934833==    by 0x3BFFB3: standard_ExecutorRun (execMain.c:364)
==00:00:00:12.519 934833==    by 0x3C007F: ExecutorRun (execMain.c:308)
==00:00:00:12.519 934833==    by 0x55F21F: PortalRunSelect (pquery.c:912)
==00:00:00:12.519 934833==  Address 0xe69cc0a is 2 bytes after a block of size 8,264 alloc'd
==00:00:00:12.519 934833==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:00:12.519 934833==    by 0x6A94CE: AllocSetAlloc (aset.c:739)
==00:00:00:12.519 934833==    by 0x6B2AA7: palloc (mcxt.c:963)
==00:00:00:12.519 934833==    by 0x486B838: get_raw_page_internal (rawpage.c:154)
==00:00:00:12.519 934833==    by 0x486BC35: get_raw_page (rawpage.c:62)
==00:00:00:12.519 934833==    by 0x3BACBE: ExecInterpExpr (execExprInterp.c:699)
==00:00:00:12.519 934833==    by 0x3B7A64: ExecInterpExprStillValid (execExprInterp.c:1802)
==00:00:00:12.519 934833==    by 0x3C8C3B: ExecEvalExpr (executor.h:294)
==00:00:00:12.519 934833==    by 0x3C8C3B: ExecEvalFuncArgs (execSRF.c:836)
==00:00:00:12.519 934833==    by 0x3C95C8: ExecMakeTableFunctionResult (execSRF.c:181)
==00:00:00:12.519 934833==    by 0x3DB7D4: FunctionNext (nodeFunctionscan.c:95)
==00:00:00:12.519 934833==    by 0x3CA059: ExecScanFetch (execScan.c:133)
==00:00:00:12.519 934833==    by 0x3CA0F4: ExecScan (execScan.c:182)
==00:00:00:12.519 934833==
{
   <insert_a_suppression_name_here>
   Memcheck:Addr2
   fun:verify_brin_page
   fun:brin_revmap_data
   fun:ExecMakeTableFunctionResult
   fun:FunctionNext
   fun:ExecScanFetch
   fun:ExecScan
   fun:ExecFunctionScan
   fun:ExecProcNodeFirst
   fun:ExecProcNode
   fun:ExecutePlan
   fun:standard_ExecutorRun
   fun:ExecutorRun
   fun:PortalRunSelect
}
2020-07-04 17:57:55.915 MSK [934833] ERROR:  page is not a BRIN page of type "revmap"
2020-07-04 17:57:55.915 MSK [934833] DETAIL:  Expected special type 0000f092, got 00007f7f.
2020-07-04 17:57:55.915 MSK [934833] STATEMENT:  SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));

Reproduced on REL_10_STABLE..master.


On Sat, Jul 04, 2020 at 04:00:00PM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference:      16527
>Logged by:          Alexander Lakhin
>Email address:      exclusion@gmail.com
>PostgreSQL version: 13beta2
>Operating system:   Ubuntu 20.04
>Description:
>
>Running the following query (based on contrib/pageinspect/sql/brin.sql)
>under valgrind:
>CREATE EXTENSION pageinspect;
>CREATE TABLE test1 (a int, b text);
>INSERT INTO test1 VALUES (1, 'one');
>SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));
>
>leads to a memory access error:
> ...
>2020-07-04 17:57:55.915 MSK [934833] ERROR:  page is not a BRIN page of type "revmap"
>2020-07-04 17:57:55.915 MSK [934833] DETAIL:  Expected special type 0000f092, got 00007f7f.

Hmmm, the 7f7f kinda seems like the pattern we use for randomizing
allocated/freed memory. So I thought maybe we're not initializing the
memory properly, or maybe freeing it too early. But I was getting
different patterns, and the reality is way simpler:


test=# SELECT * FROM page_header(get_raw_page('test1', 0));
    lsn    | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----------+----------+-------+-------+-------+---------+----------+---------+-----------
 0/15BBE80 |        0 |     4 |    28 |  8160 |    8192 |     8192 |       4 |         0
(1 row)

So the page actually does not have any special part, which is where the
type is supposed to be stored. So the BrinPageType probably ends up
reading whatever is immediately after the page. Interesting.
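
To make the arithmetic concrete: with special = 8192 on an 8192-byte page
the special area is empty, so any read through the special pointer lands
past the end of the page. Below is a minimal standalone C sketch of that
calculation. It is not PostgreSQL code: the sketch_ names are hypothetical,
and the offset 6 / length 2 in the note after it assume the page type is a
2-byte field at the end of an 8-byte special area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 8192u  /* "pagesize" in the page_header() output */

/*
 * Number of bytes of a read that fall beyond the end of the page, where the
 * read covers `len` bytes starting `off` bytes into the special area.
 */
size_t
sketch_overrun_bytes(uint16_t pd_special, size_t off, size_t len)
{
    /* one past the last byte read, as an offset from the page start */
    size_t end = (size_t) pd_special + off + len;
    size_t over;

    /* the sketch only models headers whose special offset is sane */
    assert(pd_special <= SKETCH_PAGE_SIZE);

    if (end <= SKETCH_PAGE_SIZE)
        return 0;               /* read stays inside the page */

    over = end - SKETCH_PAGE_SIZE;
    return over < len ? over : len;
}
```

For the heap page above, sketch_overrun_bytes(8192, 6, 2) reports that both
bytes of the read are outside the page, while a page with an 8-byte special
area (pd_special = 8184) gives 0 for the same read.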

It might be worth adding an assert to check the PageGetSpecialPointer
result is actually within the page.
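
A rough sketch of what such an assert could look like, written as standalone
C rather than against the real page macros. The function names and the
`needed` parameter are hypothetical; the only thing taken from above is the
8192-byte page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192u     /* page size from the output above */

/* True when at least `needed` bytes lie between pd_special and the page end. */
bool
sketch_special_in_page(uint16_t pd_special, size_t needed)
{
    return pd_special <= SKETCH_BLCKSZ &&
        SKETCH_BLCKSZ - pd_special >= needed;
}

/*
 * Hypothetical checked accessor: returns the start of the special area, but
 * asserts first that the caller's expected structure fits inside the page.
 */
unsigned char *
sketch_get_special_pointer(unsigned char *page, uint16_t pd_special,
                           size_t needed)
{
    assert(sketch_special_in_page(pd_special, needed));
    return page + pd_special;
}
```

With pd_special = 8192 the predicate is false for any non-zero `needed`, so
the assert would trip on the heap page from the report. Like any assert, it
only helps in builds where assertions are compiled in.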


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Sat, Jul 04, 2020 at 10:04:25PM +0200, Tomas Vondra wrote:
>On Sat, Jul 04, 2020 at 04:00:00PM +0000, PG Bug reporting form wrote:
>>The following bug has been logged on the website:
>>
>>Bug reference:      16527
>>Logged by:          Alexander Lakhin
>>Email address:      exclusion@gmail.com
>>PostgreSQL version: 13beta2
>>Operating system:   Ubuntu 20.04
>>Description:
>>
>>Running the following query (based on contrib/pageinspect/sql/brin.sql)
>>under valgrind:
>>CREATE EXTENSION pageinspect;
>>CREATE TABLE test1 (a int, b text);
>>INSERT INTO test1 VALUES (1, 'one');
>>SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));
>>
>>leads to a memory access error:
>>==00:00:00:12.518 934833== Invalid read of size 2
>>==00:00:00:12.518 934833==    at 0x4865AE1: verify_brin_page
>>(brinfuncs.c:107)
>> ...
>>}
>>2020-07-04 17:57:55.915 MSK [934833] ERROR:  page is not a BRIN page of type "revmap"
>>2020-07-04 17:57:55.915 MSK [934833] DETAIL:  Expected special type 0000f092, got 00007f7f.
>
>Hmmm, the 7f7f kinda seems like the pattern we use for randomizing
>allocated/freed memory. So I thought maybe we're not initializing the
>memory properly, or maybe freeing it too early. But I was getting
>different patterns, and the reality is way simpler:
>
>
>test=# SELECT * FROM page_header(get_raw_page('test1', 0));
>    lsn    | checksum | flags | lower | upper | special | pagesize | version | prune_xid
>-----------+----------+-------+-------+-------+---------+----------+---------+-----------
> 0/15BBE80 |        0 |     4 |    28 |  8160 |    8192 |     8192 |       4 |         0
>(1 row)
>
>So the page actually does not have any special part, which is where the
>type is supposed to be stored. So the BrinPageType probably ends up
>reading whatever is immediately after the page. Interesting.
>
>It might be worth adding an assert to check the PageGetSpecialPointer
>result is actually within the page.
>

FWIW at first I was puzzled why we're not seeing other failures when a
page unexpectedly does not have a special chunk at the end, but the
reason is pretty simple - the page comes from a table, not from a BRIN
index. I initially missed this detail.

So there probably needs to be some sort of check (in verify_brin_page or
somewhere close) that the page has just the right pd_special value, and
fail with ERROR if not. An assert seems insufficient in this case.
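
Roughly, the check could compare the size of the special area against what a
BRIN page is expected to have before touching the type field at all. The
following is a standalone C sketch under stated assumptions, not a patch:
the 8-byte special size, the position of the type in the last two bytes,
and all sketch_ names are assumptions; 0xF092 is the expected revmap type
from the error message. A real fix would raise an ERROR where this sketch
returns a verdict.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ        8192u
#define SKETCH_BRIN_SPECIAL  8u       /* assumed size of the BRIN special area */
#define SKETCH_REVMAP_TYPE   0xF092u  /* "Expected special type" in the report */

typedef enum
{
    SKETCH_OK,
    SKETCH_BAD_SPECIAL_SIZE,    /* not an index page of the expected kind */
    SKETCH_BAD_TYPE             /* right layout, wrong page type */
} SketchVerdict;

SketchVerdict
sketch_verify_brin_page(const unsigned char *page, uint16_t pd_special,
                        uint16_t expected_type)
{
    uint16_t    type;

    assert(page != NULL);

    /* Reject the page before reading anything from the special area. */
    if (pd_special > SKETCH_BLCKSZ ||
        SKETCH_BLCKSZ - pd_special != SKETCH_BRIN_SPECIAL)
        return SKETCH_BAD_SPECIAL_SIZE;

    /* Assumption: the type occupies the last two bytes of the page. */
    memcpy(&type, page + SKETCH_BLCKSZ - sizeof(type), sizeof(type));

    return type == expected_type ? SKETCH_OK : SKETCH_BAD_TYPE;
}
```

A heap page with pd_special = 8192 is rejected by the size test, so the type
read never happens and nothing past the page is touched.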

I also wonder if other similar pageinspect cases with mismatched data
(page read from an index, passed to heap-related functions etc.) have
the same issue.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services