Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger - Mailing list pgsql-bugs
From: Alexander Lakhin
Subject: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date:
Msg-id: e989408c-1838-61cd-c968-1fcf47c6fbba@gmail.com
In response to: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
           Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
List: pgsql-bugs
Hello Tom,

I've found enough time this week to investigate the issue more deeply, so now I can hopefully answer your questions.

30.04.2023 01:24, Tom Lane wrote:
> I am not entirely comfortable with blaming this on 86dc90056 though,
> even though "git bisect" seems to finger that.  The previous coding
> there with ExecFilterJunk() also produced a virtual tuple, as you
> can see by examining ExecFilterJunk, so why didn't the problem
> manifest before?  I think the answer may be that before 86dc90056,
> we were effectively relying on the underlying SeqScan node to keep
> the buffer pinned, but now we're re-fetching the tuple and no pin
> is held by lower plan nodes.  I'm not entirely sure about that though;
> ISTM a SeqScan ought to hold pin on its current tuple in any case.
> However, if the UPDATE plan were more complicated (involving a join)
> then it's quite believable that we'd get here with no relevant pin
> being held.

I couldn't confirm that the issue is specific to a SeqScan plan; it is reproduced (and fixed by the proposed patch) with a more complex query:

cat << "EOF" | psql
CREATE TABLE bruttest(id int, cnt int DEFAULT 0, t text);
CREATE FUNCTION noop_tfunc() RETURNS trigger LANGUAGE plpgsql AS
$$BEGIN RETURN NEW; END$$;
CREATE TRIGGER brupdate_trigger BEFORE UPDATE ON bruttest
FOR EACH ROW EXECUTE FUNCTION noop_tfunc();
INSERT INTO bruttest(id, t) VALUES (1, repeat('x', 1000));
CREATE TABLE t2(id int, delta int);
INSERT INTO t2 VALUES(1, 1);
EOF

for ((i=1;i<=4;i++)); do
  echo "iteration $i"
  psql -c "UPDATE bruttest SET cnt = cnt + delta FROM t2 WHERE bruttest.id = t2.id" &
  psql -c "UPDATE bruttest SET cnt = cnt + delta FROM t2 WHERE bruttest.id = t2.id" &
  wait
done
---
psql -c "EXPLAIN UPDATE bruttest SET cnt = cnt + delta FROM t2 WHERE bruttest.id = t2.id"
                                   QUERY PLAN
-------------------------------------------------------------------------------
 Update on bruttest  (cost=241.88..485.18 rows=0 width=0)
   ->  Merge Join  (cost=241.88..485.18 rows=13560 width=16)
         Merge Cond: (bruttest.id = t2.id)
         ->  Sort  (cost=83.37..86.37 rows=1200 width=14)
               Sort Key: bruttest.id
               ->  Seq Scan on bruttest  (cost=0.00..22.00 rows=1200 width=14)
         ->  Sort  (cost=158.51..164.16 rows=2260 width=14)
               Sort Key: t2.id
               ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=14)
(9 rows)

IIUC, the buffer is pinned by oldslot only, and newslot uses it without explicit pinning, because of the following call chain:

epqslot_clean = ExecGetUpdateNewTuple(relinfo=relinfo, planSlot=epqslot_candidate, oldSlot=oldslot):
    ProjectionInfo *newProj = relinfo->ri_projectNew;
    ...
    econtext->ecxt_scantuple = oldSlot;
    ...
    ExecProject(projInfo=newProj):
        ExprState *state = &projInfo->pi_state;
        ...
        ExecEvalExprSwitchContext(state=state, econtext=projInfo->pi_exprContext):
            state->evalfunc(state, econtext, isNull) -> ExecInterpExpr(state, econtext, isNull):
                resultslot = state->resultslot;
                ...
                scanslot = econtext->ecxt_scantuple;
                ...
                EEOP_SCAN_FETCHSOME:
                    slot_getsomeattrs(scanslot, op->d.fetch.last_var);
                ...
                EEOP_ASSIGN_SCAN_VAR:
                    resultslot->tts_values[resultnum] = scanslot->tts_values[attnum];

Note that ExecGetUpdateNewTuple() returns relinfo->ri_projectNew->pi_state.resultslot.

So the only place where the buffer could be pinned for newslot correctly is EEOP_ASSIGN_SCAN_VAR, but I think that changing something there is not an option.
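Stripped of executor details, the pattern is just a pointer copied into storage whose lifetime is not extended. A minimal standalone C sketch of that pattern (all names are illustrative; nothing below is PostgreSQL code):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for a buffer page and a slot's tts_values[] entry. */
typedef struct { char *page; } FakeBuffer;
typedef struct { const char *datum; } FakeSlot;

int main(void)
{
    FakeBuffer buf = { strdup("xxxx... (tuple data in a buffer page)") };

    FakeSlot oldslot = { buf.page };    /* oldslot references the page */
    FakeSlot newslot;

    /* Like EEOP_ASSIGN_SCAN_VAR: copy the pointer, not the bytes.
     * newslot now depends on the page staying valid, yet holds no
     * "pin" of its own. */
    newslot.datum = oldslot.datum;

    /* Like releasing oldslot's underlying storage (what
     * ExecFetchSlotHeapTuple() may do to oldslot). */
    free(buf.page);
    buf.page = NULL;

    /* Use-after-free: the analogue of the incorrect memory access in
     * this bug report; AddressSanitizer flags this read. */
    printf("%s\n", newslot.datum);
    return 0;
}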
> The practical impact of that concern is that I'm not convinced that
> the proposed patch fixes all variants of the issue.  Maybe we need
> to materialize even if newslot != epqslot_clean.

I investigated all code paths and came to the conclusion that the case newslot != epqslot_clean is impossible.

1) When ExecBRUpdateTriggers() is called via ExecUpdatePrologue(), ..., ExecModifyTable(), we have:

ExecModifyTable():
    /* Initialize projection info if first time for this table */
    if (unlikely(!resultRelInfo->ri_projectNewInfoValid))
        ExecInitUpdateProjection(node, resultRelInfo);
    slot = ExecGetUpdateNewTuple(resultRelInfo, context.planSlot, oldSlot);
    ...
    ExecUpdate(..., resultRelInfo=resultRelInfo, ..., slot=slot, ...):
        ExecUpdatePrologue(..., resultRelInfo=resultRelInfo, ..., slot=slot, ...):
            ExecMaterializeSlot(slot);
            ...
            ExecBRUpdateTriggers(..., relinfo=resultRelInfo, ..., newslot=slot, ...):
                epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate, oldslot);

(so epqslot_clean and newslot are both equal to relinfo->ri_projectNew->pi_state.resultslot)

2) When ExecBRUpdateTriggers() is called via ExecMergeMatched(), ..., ExecMerge(), epqslot_candidate == NULL always, because of

    /*
     * Recheck the tuple using EPQ. For MERGE, we leave this
     * to the caller (it must do additional rechecking, and
     * might end up executing a different action entirely).
     */

inside GetTupleForTrigger().

3) When ExecBRUpdateTriggers() is called via ExecSimpleRelationUpdate(), ..., apply_handle_tuple_routing():

    FindReplTupleInLocalRel()
    ...
    ExecSimpleRelationUpdate()

a concurrent update of the target tuple is prevented by table_tuple_lock() in RelationFindReplTupleSeq()/RelationFindReplTupleByIndex(), called from FindReplTupleInLocalRel().

Moreover, if epqslot_candidate != NULL were true when ExecBRUpdateTriggers() is called from ExecSimpleRelationUpdate()/ExecMergeMatched(), then we would get a crash inside

epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate, oldslot);

because of

ExecGetUpdateNewTuple(relinfo, ...):
    ProjectionInfo *newProj = relinfo->ri_projectNew;
    ...
    Assert(relinfo->ri_projectNewInfoValid);
    ...
    econtext = newProj->pi_exprContext;

as ExecInitUpdateProjection(), which initializes ri_projectNew, is called only from ExecModifyTable().

So I think it's safe to assume that newslot == epqslot_clean when epqslot_candidate != NULL, and thus the ExecCopySlot() call may be removed.

> Maybe we need to
> do it even if there wasn't a concurrent update.

Without a concurrent update, i.e. when epqslot_candidate == NULL, newslot is filled earlier, in ExecModifyTable():

    oldSlot = resultRelInfo->ri_oldTupleSlot;
    ...
    slot = ExecGetUpdateNewTuple(resultRelInfo, context.planSlot, oldSlot);

so newslot uses the oldSlot existing in ExecModifyTable(), not the oldslot created in ExecBRUpdateTriggers().

> Maybe we need to
> do it in pre-v14 branches, too.

As I noted in my previous message, before 86dc90056 a different slot was used when getting "some attrs", so I see no need to fix that state.

> It's certainly not obvious that this
> function ought to assume anything about which slots are pointing at
> what.

I think the problem is in the sequence of calls:

epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate, oldslot);
trigtuple = ExecFetchSlotHeapTuple(oldslot, true, &should_free_trig);

As ExecGetUpdateNewTuple() can fill attributes in oldslot, and ExecFetchSlotHeapTuple() can release oldslot's underlying storage (as the comment for that function says), ExecMaterializeSlot() should be called in between to save the new tuple contents.
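To spell out where the window is, here are the same two calls annotated (this is just an annotation of the existing code, not a patch; the comments are mine):

/* 1. Project the new tuple.  The result lands in ri_projectNew's result
 *    slot (== newslot); its by-reference columns point into oldslot's
 *    buffer, and no pin is taken on newslot's behalf. */
epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate, oldslot);

/* 2. This call may release oldslot's underlying storage, so from here
 *    on newslot's by-reference datums may dangle. */
trigtuple = ExecFetchSlotHeapTuple(oldslot, true, &should_free_trig);

/* ExecMaterializeSlot(newslot) has to run between 1 and 2, while the
 * buffer contents are still guaranteed valid. */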
> Also, if we do materialize here, does this code a little further down
> become redundant?
>
>         if (should_free_trig && newtuple == trigtuple)
>             ExecMaterializeSlot(newslot);

That is another interesting case, and I think it's a separate one. I could not reproduce that issue (commit 75e03eabea from 2019 mentions zheap), but I suppose it was real at the moment of the commit, which was two years before 86dc90056. So I think that an ExecMaterializeSlot() inside the condition "if (epqslot_candidate != NULL)" would not cover that case.

As an outcome, I propose an improved patch that:
1) removes the unreachable ExecCopySlot() call;
2) adds one ExecMaterializeSlot() call that covers both cases;
3) removes the separate ExecMaterializeSlot() call, which becomes redundant.

It simplifies ExecBRUpdateTriggers() a little, but if it might affect performance significantly (I don't see how), maybe leave the previous ExecMaterializeSlot() call intact and just add another one right after epqslot_clean = ExecGetUpdateNewTuple(...);
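To make that fallback variant concrete (a sketch only -- the attached patch is authoritative, and the Assert is merely one way to record the invariant established by the path analysis above):

if (epqslot_candidate != NULL)
{
    TupleTableSlot *epqslot_clean;

    epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate,
                                          oldslot);

    /* (1) The ExecCopySlot() that used to be here is unreachable;
     * document the invariant instead. */
    Assert(newslot == epqslot_clean);

    /* Added: detach newslot from oldslot's buffer while the buffer
     * is still valid, i.e. before the fetch from oldslot below. */
    ExecMaterializeSlot(newslot);
}

trigtuple = ExecFetchSlotHeapTuple(oldslot, true, &should_free_trig);

/* ... trigger calls ... */

/* Kept in this variant, since a materialization inside the EPQ branch
 * above does not cover the newtuple == trigtuple case: */
if (should_free_trig && newtuple == trigtuple)
    ExecMaterializeSlot(newslot);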
> A completely different way of looking at it is that we should not
> treat this as ExecBRUpdateTriggers's bug at all.  The slot mechanisms
> are supposed to protect the data referenced by a slot, so why is that
> failing to happen in this example?  The correct fix might involve
> newslot acquiring a buffer pin, for example.

Unfortunately, I could not find a place where such a pin could be acquired (except for EEOP_ASSIGN_SCAN_VAR at the lower level).

> Bottom line is that I'm afraid the present patch is band-aiding a
> specific problem without solving the general case.

The next one is more generic, but still confined to ExecBRUpdateTriggers(), as I still see no problem outside it.

Best regards,
Alexander

Attachment