BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date
Msg-id 17798-0907404928dcf0dd@postgresql.org
Whole thread Raw
Responses Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17798
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 15.2
Operating system:   Ubuntu 22.04
Description:

When executing the following queries under valgrind:
cat << "EOF" | psql
CREATE TABLE bruttest(cnt int DEFAULT 0, t text);
CREATE FUNCTION noop_tfunc() RETURNS trigger LANGUAGE plpgsql AS $$ BEGIN
RETURN NEW; END$$;
CREATE TRIGGER brupdate_trigger BEFORE UPDATE ON bruttest FOR EACH ROW
EXECUTE FUNCTION noop_tfunc();
INSERT INTO bruttest(t) VALUES (repeat('x', 1000));
EOF

for ((i=1;i<=4;i++)); do
  echo "iteration $i"
  psql -c "UPDATE bruttest SET cnt = cnt + 1" &
  psql -c "UPDATE bruttest SET cnt = cnt + 1" &
  wait
done

I get a failure:
...
iteration 4
UPDATE 1
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

server.log contains:
==00:00:00:08.877 3381345== Invalid read of size 1
==00:00:00:08.877 3381345==    at 0x1E4721: heap_compute_data_size
(heaptuple.c:138)
==00:00:00:08.877 3381345==    by 0x1E5674: heap_form_tuple
(heaptuple.c:1061)
==00:00:00:08.877 3381345==    by 0x413AF6: tts_buffer_heap_materialize
(execTuples.c:748)
==00:00:00:08.877 3381345==    by 0x414CB9: ExecFetchSlotHeapTuple
(execTuples.c:1654)
==00:00:00:08.877 3381345==    by 0x3DF10B: ExecBRUpdateTriggers
(trigger.c:3074)
==00:00:00:08.877 3381345==    by 0x438E2C: ExecUpdatePrologue
(nodeModifyTable.c:1912)
==00:00:00:08.877 3381345==    by 0x43A623: ExecUpdate
(nodeModifyTable.c:2269)
==00:00:00:08.877 3381345==    by 0x43D8BD: ExecModifyTable
(nodeModifyTable.c:3856)
==00:00:00:08.877 3381345==    by 0x40E7F1: ExecProcNodeFirst
(execProcnode.c:464)
==00:00:00:08.877 3381345==    by 0x406E19: ExecProcNode (executor.h:259)
==00:00:00:08.877 3381345==    by 0x406E19: ExecutePlan (execMain.c:1636)
==00:00:00:08.877 3381345==    by 0x406FF9: standard_ExecutorRun
(execMain.c:363)
==00:00:00:08.877 3381345==    by 0x4070C5: ExecutorRun (execMain.c:307)
==00:00:00:08.877 3381345==  Address 0x957d114 is in a rw- anonymous
segment
==00:00:00:08.877 3381345==
{
   <insert_a_suppression_name_here>
...
==00:00:00:08.877 3381345==
==00:00:00:08.877 3381345== Exit program on first error
(--exit-on-first-error=yes)
2023-02-17 11:27:17.663 MSK [3381268] LOG:  server process (PID 3381345)
exited with exit code 1
2023-02-17 11:27:17.663 MSK [3381268] DETAIL:  Failed process was running:
UPDATE bruttest SET cnt = cnt + 1

I've also seen a relevant failure detected by asan (on master):
==1912964==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x7f223dd52c78 at pc 0x55c9e846146a bp 0x7fff0c63e560 sp 0x7fff0c63dd30
WRITE of size 42205122 at 0x7f223dd52c78 thread T0
...
#8  0x000055c9e846953f in __asan::ReportGenericError(unsigned long, unsigned
long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool)
()
#9  0x000055c9e846148c in __asan_memcpy ()
#10 0x000055c9e84d09c5 in fill_val (att=<optimized out>, bit=<optimized
out>, bitmask=bitmask@entry=0x7fff0c63e620, 
    dataP=dataP@entry=0x7fff0c63e5e0,
infomask=infomask@entry=0x7f223dd09864, datum=<optimized out>, 
    isnull=<optimized out>) at heaptuple.c:270
#11 0x000055c9e84d0080 in heap_fill_tuple (tupleDesc=<optimized out>,
values=<optimized out>, isnull=<optimized out>, 
    data=<optimized out>, data_size=<optimized out>, infomask=<optimized
out>, bit=<optimized out>) at heaptuple.c:336
#12 0x000055c9e84d530e in heap_form_tuple (tupleDescriptor=0x7f22898f76e8,
values=0x1d3084, isnull=0x6250000c85d8)
    at heaptuple.c:1090
#13 0x000055c9e8bc8d0e in tts_buffer_heap_materialize (slot=0x6250000c8548)
at execTuples.c:748
#14 0x000055c9e8bcd02a in ExecFetchSlotHeapTuple
(slot=slot@entry=0x6250000c8548, materialize=<optimized out>, 
    shouldFree=shouldFree@entry=0x7fff0c63e810) at execTuples.c:1654
#15 0x000055c9e8afbd9a in ExecBRUpdateTriggers (estate=<optimized out>,
epqstate=<optimized out>, 
    relinfo=0x6250000a2e80, tupleid=<optimized out>,
fdw_trigtuple=<optimized out>, newslot=<optimized out>, 
    tmfd=<optimized out>) at trigger.c:3022
...

According to git bisect, first bad commit for this anomaly is 86dc90056.

That commit changed just one thing in trigger.c:
-                       epqslot_clean =
ExecFilterJunk(relinfo->ri_junkFilter, epqslot_candidate);
+                       epqslot_clean = ExecGetUpdateNewTuple(relinfo,
epqslot_candidate,
+
                     oldslot);

I've found the following explanation for the failure:
1) After the ExecGetUpdateNewTuple() call the newslot and the oldslot are
 linked together (their slot->tts_values[1] in this case point to the same
 memory address (inside the oldslot' buffer)).
2) Previously, GetTupleForTrigger() could get a tuple with a new buffer,
 so the oldslot would be the only user of the buffer at that moment.
 (The newslot doesn't become an official user of the buffer.)
3) Then, trigtuple = ExecFetchSlotHeapTuple(oldslot, ...) invokes
 tts_buffer_heap_materialize() where the oldslot->buffer is released.
4) Finally, newtuple = ExecFetchSlotHeapTuple(newslot, ...) invokes
 tts_buffer_heap_materialize() where an incorrect access to memory
 that became anonymous occurs, and that is detected by valgrind.
 If not detected, different consequences are possible (in the asan case
 it was memcpy with an incorrectly read extra large data_len).


pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: BUG #17797: connection error
Next
From: Samo Dadela
Date:
Subject: pg_restore: warning: could not find where to insert IF EXISTS in statement "-- *not* dropping schema, since initdb creates it