BUG #17950: Incorrect memory access in gtsvector_picksplit() - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17950: Incorrect memory access in gtsvector_picksplit() |
Date | |
Msg-id | 17950-6c80a8d2b94ec695@postgresql.org |
Responses | Re: BUG #17950: Incorrect memory access in gtsvector_picksplit() |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference:      17950
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16beta1
Operating system:   Ubuntu 22.04
Description:

The following script:
CREATE TABLE test_tsvector(t text, a tsvector);
SELECT 'COPY test_tsvector FROM ''.../src/test/regress/data/tsearch.data'';'
FROM generate_series(1, 19)
\gexec
CREATE TRIGGER tsvectorupdate
BEFORE UPDATE OR INSERT ON test_tsvector
FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(a, 'pg_catalog.english', t);
SELECT 'COPY test_tsvector FROM ''.../src/test/regress/data/tsearch.data'';'
FROM generate_series(1, 38)
\gexec
CREATE INDEX gistidx ON test_tsvector USING gist (a tsvector_ops(siglen=1));
-- I believe it's not the only way to get a data pattern needed, so
-- probably the repro could be simplified, if desired.

triggers a Valgrind-detected memory access error:
==00:00:00:53.414 342514== Invalid read of size 1
==00:00:00:53.414 342514==    at 0x79787D: pg_popcount (pg_bitutils.c:332)
==00:00:00:53.414 342514==    by 0x6F93C6: sizebitvec (tsgistidx.c:488)
==00:00:00:53.414 342514==    by 0x6FA24A: gtsvector_picksplit (tsgistidx.c:731)
==00:00:00:53.414 342514==    by 0x74146D: FunctionCall2Coll (fmgr.c:1132)
==00:00:00:53.414 342514==    by 0x217E65: gistUserPicksplit (gistsplit.c:433)
==00:00:00:53.414 342514==    by 0x2184A5: gistSplitByKey (gistsplit.c:697)
==00:00:00:53.414 342514==    by 0x20C237: gistSplit (gist.c:1450)
==00:00:00:53.414 342514==    by 0x20CA32: gistplacetopage (gist.c:309)
==00:00:00:53.414 342514==    by 0x20DBFB: gistinserttuples (gist.c:1278)
==00:00:00:53.414 342514==    by 0x20DF85: gistfinishsplit (gist.c:1376)
==00:00:00:53.414 342514==    by 0x20DC79: gistinserttuples (gist.c:1305)
==00:00:00:53.414 342514==    by 0x20E39E: gistinserttuple (gist.c:1231)
==00:00:00:53.414 342514==  Address 0x7386c68 is 16,024 bytes inside a block of size 16,384 alloc'd
==00:00:00:53.414 342514==    at 0x4848899: malloc (vg_replace_malloc.c:381)
==00:00:00:53.414 342514==    by 0x7602C7: AllocSetAlloc (aset.c:924)
==00:00:00:53.414 342514==    by 0x76C26A: palloc (mcxt.c:1240)
==00:00:00:53.414 342514==    by 0x20C24A: gistSplit (gist.c:1453)
==00:00:00:53.414 342514==    by 0x20CA32: gistplacetopage (gist.c:309)
==00:00:00:53.414 342514==    by 0x20DBFB: gistinserttuples (gist.c:1278)
==00:00:00:53.414 342514==    by 0x20E39E: gistinserttuple (gist.c:1231)
==00:00:00:53.414 342514==    by 0x20E940: gistdoinsert (gist.c:886)
==00:00:00:53.414 342514==    by 0x21152B: gistBuildCallback (gistbuild.c:929)
==00:00:00:53.414 342514==    by 0x24420D: heapam_index_build_range_scan (heapam_handler.c:1708)
==00:00:00:53.414 342514==    by 0x2119E4: table_index_build_scan (tableam.h:1781)
==00:00:00:53.414 342514==    by 0x2119E4: gistbuild (gistbuild.c:317)
==00:00:00:53.414 342514==    by 0x2E06F5: index_build (index.c:3032)
==00:00:00:53.414 342514==

(Several runs might be required for the issue reproduction.)

With the additional debug logging in gtsvector_picksplit():
@@ -722,6 +722,11 @@ gtsvector_picksplit(PG_FUNCTION_ARGS)
                 continue;
             }

+if (!cache[j].allistrue) {
+elog(LOG, "!!!gtsvector_picksplit| j: %d, cache[j].sign: %p, GETSIGN(cache[j].sign): %p", j, cache[j].sign, GETSIGN(cache[j].sign));
+VALGRIND_CHECK_MEM_IS_DEFINED(GETSIGN(cache[j].sign), siglen);
+}
+
             if (ISALLTRUE(datum_l) || cache[j].allistrue)
             {
                 if (ISALLTRUE(datum_l) && cache[j].allistrue)

(and #include "utils/memdebug.h")

I see:
cache[j].sign: 0x723a999, GETSIGN(cache[j].sign): 0x723a9a1
==00:00:00:18.356 351519== Unaddressable byte(s) found during client check request
==00:00:00:18.356 351519==    at 0x6FA2BE: gtsvector_picksplit (tsgistidx.c:727)
...
==00:00:00:18.356 351519==  Address 0x723a9a1 is 4,977 bytes inside a block of size 8,192 alloc'd

Reproduced starting from 911e70207.
But with the slight variation:
CREATE INDEX gistidx ON test_tsvector USING gist (a tsvector_ops);

and the debugging patch:
@@ -711,6 +712,9 @@ gtsvector_picksplit(PG_FUNCTION_ARGS)
             else
                 size_alpha = hemdistsign(cache[j].sign, GETSIGN(datum_l));

+if (!cache[j].allistrue)
+VALGRIND_CHECK_MEM_IS_DEFINED(GETSIGN(cache[j].sign), SIGLEN);
+
             if (ISALLTRUE(datum_r) || cache[j].allistrue)
             {
                 if (ISALLTRUE(datum_r) && cache[j].allistrue)

it's reproduced even on 911e70207~1:
==00:00:00:15.858 370963== Unaddressable byte(s) found during client check request
==00:00:00:15.858 370963==    at 0x636E0E: gtsvector_picksplit (tsgistidx.c:716)
==00:00:00:15.858 370963==    by 0x67B6D3: FunctionCall2Coll (fmgr.c:1162)
==00:00:00:15.858 370963==    by 0x1FB662: gistUserPicksplit (gistsplit.c:433)
==00:00:00:15.858 370963==    by 0x1FBCCF: gistSplitByKey (gistsplit.c:697)
==00:00:00:15.858 370963==    by 0x1F088A: gistSplit (gist.c:1441)
==00:00:00:15.858 370963==    by 0x1F0F3C: gistplacetopage (gist.c:302)
==00:00:00:15.858 370963==    by 0x1F202F: gistinserttuples (gist.c:1270)
==00:00:00:15.858 370963==    by 0x1F273A: gistinserttuple (gist.c:1223)
==00:00:00:15.858 370963==    by 0x1F2BCE: gistdoinsert (gist.c:879)
==00:00:00:15.858 370963==    by 0x1F4E8F: gistBuildCallback (gistbuild.c:470)
==00:00:00:15.858 370963==    by 0x2262BD: heapam_index_build_range_scan (heapam_handler.c:1659)
==00:00:00:15.858 370963==    by 0x1F50E7: table_index_build_scan (tableam.h:1540)
==00:00:00:15.858 370963==    by 0x1F50E7: gistbuild (gistbuild.c:196)
==00:00:00:15.858 370963==  Address 0x11a3a68b is 4,939 bytes inside a block of size 16,384 alloc'd
==00:00:00:15.858 370963==    at 0x4848899: malloc (vg_replace_malloc.c:381)
==00:00:00:15.858 370963==    by 0x699954: AllocSetAlloc (aset.c:941)
==00:00:00:15.858 370963==    by 0x6A26CC: palloc (mcxt.c:963)
==00:00:00:15.858 370963==    by 0x636990: gtsvector_picksplit (tsgistidx.c:613)
...

So it looks like this defect exists in core since 140d4ebcb.
IIUC, using the GETSIGN macro with cache[j].sign is a mistake -- it erroneously adds 8 to an address of the sign field, so for the last j it leads to an out-of-bounds memory read.
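
To illustrate the arithmetic (a standalone sketch, not the code in tsgistidx.c): the two pointers in the debug output, 0x723a999 and 0x723a9a1, differ by exactly 8. The model below assumes a cache entry is one flag byte followed by siglen signature bytes and that the cache holds exactly n entries; SKETCH_HDRSIZE, SKETCH_GETSIGN and all function names are hypothetical stand-ins, and the real allocation in gtsvector_picksplit() may be sized differently.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: an 8-byte key header, matching the 8-byte shift in the log. */
#define SKETCH_HDRSIZE 8

/* Hypothetical stand-in for GETSIGN(): skip the header of a complete key. */
#define SKETCH_GETSIGN(x) ((const unsigned char *) (x) + SKETCH_HDRSIZE)

/* Modelled cache entry: one flag byte followed by siglen signature bytes. */
size_t
entry_size(size_t siglen)
{
    return 1 + siglen;
}

/* Offset of entry j's signature bytes from the start of the cache array. */
size_t
sign_offset(size_t j, size_t siglen)
{
    return j * entry_size(siglen) + 1;
}

/* How far the macro moves a pointer that already addresses signature bytes. */
ptrdiff_t
getsign_shift(void)
{
    static const unsigned char buf[2 * SKETCH_HDRSIZE] = {0};

    return SKETCH_GETSIGN(buf) - buf;
}

/*
 * Returns 1 if reading siglen bytes starting at SKETCH_GETSIGN(signature of
 * entry j) would run past the end of a cache holding exactly n entries.
 */
int
misread_overruns(size_t n, size_t j, size_t siglen)
{
    size_t      read_end;

    assert(j < n);
    read_end = sign_offset(j, siglen) + SKETCH_HDRSIZE + siglen;
    return read_end > n * entry_size(siglen);
}
```

Under these assumptions, with n = 10 and siglen = 124, misread_overruns() is true for j = 9 and false for j = 8, which matches the observation that the last j is the one affected. With siglen = 1 the entries are only 2 bytes wide, so in this model the 8-byte shift already overruns for several of the trailing entries.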