BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger |
Date | |
Msg-id | 17798-0907404928dcf0dd@postgresql.org Whole thread Raw |
Responses |
Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17798 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 15.2 Operating system: Ubuntu 22.04 Description: When executing the following queries under valgrind: cat << "EOF" | psql CREATE TABLE bruttest(cnt int DEFAULT 0, t text); CREATE FUNCTION noop_tfunc() RETURNS trigger LANGUAGE plpgsql AS $$ BEGIN RETURN NEW; END$$; CREATE TRIGGER brupdate_trigger BEFORE UPDATE ON bruttest FOR EACH ROW EXECUTE FUNCTION noop_tfunc(); INSERT INTO bruttest(t) VALUES (repeat('x', 1000)); EOF for ((i=1;i<=4;i++)); do echo "iteration $i" psql -c "UPDATE bruttest SET cnt = cnt + 1" & psql -c "UPDATE bruttest SET cnt = cnt + 1" & wait done I get a failure: ... iteration 4 UPDATE 1 server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. connection to server was lost server.log contains: ==00:00:00:08.877 3381345== Invalid read of size 1 ==00:00:00:08.877 3381345== at 0x1E4721: heap_compute_data_size (heaptuple.c:138) ==00:00:00:08.877 3381345== by 0x1E5674: heap_form_tuple (heaptuple.c:1061) ==00:00:00:08.877 3381345== by 0x413AF6: tts_buffer_heap_materialize (execTuples.c:748) ==00:00:00:08.877 3381345== by 0x414CB9: ExecFetchSlotHeapTuple (execTuples.c:1654) ==00:00:00:08.877 3381345== by 0x3DF10B: ExecBRUpdateTriggers (trigger.c:3074) ==00:00:00:08.877 3381345== by 0x438E2C: ExecUpdatePrologue (nodeModifyTable.c:1912) ==00:00:00:08.877 3381345== by 0x43A623: ExecUpdate (nodeModifyTable.c:2269) ==00:00:00:08.877 3381345== by 0x43D8BD: ExecModifyTable (nodeModifyTable.c:3856) ==00:00:00:08.877 3381345== by 0x40E7F1: ExecProcNodeFirst (execProcnode.c:464) ==00:00:00:08.877 3381345== by 0x406E19: ExecProcNode (executor.h:259) ==00:00:00:08.877 3381345== by 0x406E19: ExecutePlan (execMain.c:1636) ==00:00:00:08.877 3381345== by 0x406FF9: standard_ExecutorRun (execMain.c:363) ==00:00:00:08.877 3381345== by 0x4070C5: ExecutorRun (execMain.c:307) ==00:00:00:08.877 3381345== Address 0x957d114 is in a rw- anonymous segment ==00:00:00:08.877 3381345== { <insert_a_suppression_name_here> ... ==00:00:00:08.877 3381345== ==00:00:00:08.877 3381345== Exit program on first error (--exit-on-first-error=yes) 2023-02-17 11:27:17.663 MSK [3381268] LOG: server process (PID 3381345) exited with exit code 1 2023-02-17 11:27:17.663 MSK [3381268] DETAIL: Failed process was running: UPDATE bruttest SET cnt = cnt + 1 I've also seen a relevant failure detected by asan (on master): ==1912964==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f223dd52c78 at pc 0x55c9e846146a bp 0x7fff0c63e560 sp 0x7fff0c63dd30 WRITE of size 42205122 at 0x7f223dd52c78 thread T0 ... #8 0x000055c9e846953f in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) () #9 0x000055c9e846148c in __asan_memcpy () #10 0x000055c9e84d09c5 in fill_val (att=<optimized out>, bit=<optimized out>, bitmask=bitmask@entry=0x7fff0c63e620, dataP=dataP@entry=0x7fff0c63e5e0, infomask=infomask@entry=0x7f223dd09864, datum=<optimized out>, isnull=<optimized out>) at heaptuple.c:270 #11 0x000055c9e84d0080 in heap_fill_tuple (tupleDesc=<optimized out>, values=<optimized out>, isnull=<optimized out>, data=<optimized out>, data_size=<optimized out>, infomask=<optimized out>, bit=<optimized out>) at heaptuple.c:336 #12 0x000055c9e84d530e in heap_form_tuple (tupleDescriptor=0x7f22898f76e8, values=0x1d3084, isnull=0x6250000c85d8) at heaptuple.c:1090 #13 0x000055c9e8bc8d0e in tts_buffer_heap_materialize (slot=0x6250000c8548) at execTuples.c:748 #14 0x000055c9e8bcd02a in ExecFetchSlotHeapTuple (slot=slot@entry=0x6250000c8548, materialize=<optimized out>, shouldFree=shouldFree@entry=0x7fff0c63e810) at execTuples.c:1654 #15 0x000055c9e8afbd9a in ExecBRUpdateTriggers (estate=<optimized out>, epqstate=<optimized out>, relinfo=0x6250000a2e80, tupleid=<optimized out>, fdw_trigtuple=<optimized out>, newslot=<optimized out>, tmfd=<optimized out>) at trigger.c:3022 ... According to git bisect, first bad commit for this anomaly is 86dc90056. That commit changed just one thing in trigger.c: - epqslot_clean = ExecFilterJunk(relinfo->ri_junkFilter, epqslot_candidate); + epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate, + oldslot); I've found the following explanation for the failure: 1) After the ExecGetUpdateNewTuple() call the newslot and the oldslot are linked together (their slot->tts_values[1] in this case point to the same memory address (inside the oldslot' buffer)). 2) Previously, GetTupleForTrigger() could get a tuple with a new buffer, so the oldslot would be the only user of the buffer at that moment. (The newslot doesn't become an official user of the buffer.) 3) Then, trigtuple = ExecFetchSlotHeapTuple(oldslot, ...) invokes tts_buffer_heap_materialize() where the oldslot->buffer is released. 4) Finally, newtuple = ExecFetchSlotHeapTuple(newslot, ...) invokes tts_buffer_heap_materialize() where an incorrect access to memory that became anonymous occurs, and that is detected by valgrind. If not detected, different consequences are possible (in the asan case it was memcpy with an incorrectly read extra large data_len).
pgsql-bugs by date: