Re: PostgreSQL12 crash bug report - Mailing list pgsql-bugs
From | Andres Freund |
---|---|
Subject | Re: PostgreSQL12 crash bug report |
Date | |
Msg-id | 20190801004000.3xis75ulpun4ha3p@alap3.anarazel.de Whole thread Raw |
In response to | Re: PostgreSQL12 crash bug report (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: PostgreSQL12 crash bug report
|
List | pgsql-bugs |
Hi, On 2019-07-31 19:07:29 -0400, Tom Lane wrote: > I wrote: > > yi huang <yi.codeplayer@gmail.com> writes: > >> [ crashes with ] > >> $ pgbench -h 127.0.0.1 -p 5432 -U exchange exchange -T 2 -c 4 -f /tmp/test.sql > > > As per the stack trace, we're trying to build a new tuple for the output > > of a ValuesScan node, but the targetlist for that seems completely insane: > > After digging through this, it seems clear that it's the fault of > ad0bda5d2, specifically this code in EvalPlanQualSlot: > > *slot = ExecAllocTableSlot(&epqstate->estate->es_tupleTable, > epqstate->origslot->tts_tupleDescriptor, > &TTSOpsVirtual); > > It's setting up an es_epqTupleSlot[] entry on the assumption that it > should have the same tupdesc as the output tuple that's to be rechecked. > This might accidentally fail to malfunction in single-table cases, > but it's quite broken for any join case --- in particular, for the > given test case, the scan tuple for the VALUES node surely doesn't have > the same tupdesc as the join result. :( We really need to improve the test coverage for EPQ :(. I tried to add a few cases, but it's still one of the least tested complicated areas in PG. To make sure I understand - the problem isn't the slot that we've created in nodeModifyTable.c via EvalPlanQualSlot(), right? It's the one we create in EvalPlanQualFetchRowMarks(), because we don't have a proper tuple type handy to create the slot with? And then ExecScanFetch() (on behalf of ValueScan in this case) expects something with the correct format in that slot, which causes the backtrace you show. And this problem occurs "only" for ROW_MARK_COPY, because for everything else we have an associated relation. Previously we simply didn't need to know the type during EPQ setup, because we only stored a HeapTuple anyway. And we'd know the appropriate tupledesc at the places that access the tuple. Right? > I think we need to change the API of EvalPlanQualSlot, because it > cannot determine the correct tupdesc when it's not handed a Relation. > I'm not entirely sure that its immediate callers can either :-( so > this might propagate out a ways. Or perhaps we should move the > slot-making someplace else. I think we'll probably have to split EvalPlanQualSlot() into two functions. One for nodeModifyTable.c and nodeLockRows.c, and the !ROW_MARK_COPY cases in EvalPlanQualFetchRowMarks(). One bigger change - but possibly one worth it - would be to basically move the work done in EvalPlanQualFetchRowMarks() to be done on-demand, at least for ROW_MARK_COPY. We don't need to do /* fetch the whole-row Var for the relation */ datum = ExecGetJunkAttribute(epqstate->origslot, aerm->wholeAttNo, &isNull); /* non-locked rels could be on the inside of outer joins */ if (isNull) continue; that early in the query. We could call EvalPlanQualFetchRowMark() (note the singular) from ExecScanFetch(), that ought to currently be the only place that accesses ROW_MARK_COPY type rowmarks? We could do the same for the other types of rowmarks, but I think that'd be semantically somewhat murkier. That would have the advantage that we'd not unnecessarily put row marks we might never need into slots. There's probably not that many cases where that matters, though. Or we could keep enough information for ROW_MARK_COPY RowMarks around to create the necessary tupledesc (there's a number of variants as to how - don't zap the information in add_rte_to_flat_rtable, have setrefs.c somehow keep maintain rti indexed pointers to Plan nodes, ...). Not sure yet what's best here. Greetings, Andres Freund
pgsql-bugs by date: