Re: [HACKERS] WIP: Faster Expression Processing v4 - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] WIP: Faster Expression Processing v4 |
Date | |
Msg-id | 26088.1490315792@sss.pgh.pa.us Whole thread Raw |
In response to | Re: [HACKERS] WIP: Faster Expression Processing v4 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] WIP: Faster Expression Processing v4
Re: [HACKERS] WIP: Faster Expression Processing v4 |
List | pgsql-hackers |
I found a rather nasty bug :-( ... the comment in EEOP_INNER_VAR_FIRST about + /* + * Can't assert tts_nvalid, as wholerow var evaluation or such + * could have materialized the slot - but the contents are still + * valid :/ + */ + Assert(op->d.var.attnum >= 0); is actually averting its eyes from a potentially serious problem. If we go through ExecEvalWholeRowVar, which calls ExecFetchSlotTupleDatum, and ExecFetchSlotTuple decides it has to materialize the tuple, then ExecMaterializeSlot does this: /* * Drop the pin on the referenced buffer, if there is one. */ if (BufferIsValid(slot->tts_buffer)) ReleaseBuffer(slot->tts_buffer); slot->tts_buffer = InvalidBuffer; /* * Mark extracted state invalid. This is important because the slot is * not supposed to depend any more on theprevious external data; we * mustn't leave any dangling pass-by-reference datums in tts_values. * However, we havenot actually invalidated any such datums, if there * happen to be any previously fetched from the slot. (Note inparticular * that we have not pfree'd tts_mintuple, if there is one.) */ slot->tts_nvalid = 0; The problem here is that once we drop the buffer pin, any pointers we may have into on-disk data are dangling pointers --- we're at risk of some other backend taking away that shared buffer. (So I'm afraid that the commentary there is overly optimistic.) So even though those pointers may still be there beyond tts_nvalid, subsequent references to them are very dangerous. I think that it's pretty hard to hit this in practice, maybe impossible, because the normal case for an "on-disk" tuple is that TTS_HAS_PHYSICAL_TUPLE is true, so that ExecFetchSlotTuple won't change the state of the slot. If we have a virtual tuple that has to be materialized, then by that very token it won't have a buffer pin to drop. But I find this fragile as heck, and the aforesaid patch comment surely isn't adequately documenting the safety considerations. Also, if there ever were a live bug here, reproducing it would be damn hard because of the low probability that a just-unpinned buffer would get replaced any time soon. (Hm, I wonder whether the buffer cache needs something analogous to the syscaches' CLOBBER_CACHE_ALWAYS behavior...) Besides which, I really really don't like the lack of an "attnum < tts_nvalid" assertion there; that's just obviously failing to check for very simple bugs, such as getting the FETCHSOME steps wrong. So I think that we have got to fix ExecEvalWholeRowVar so that it doesn't clobber the state of the slot. Right at the moment, the only way to do that seems to be to do this instead of ExecFetchSlotTupleDatum: tuple = ExecCopySlotTuple(slot); dtuple = (HeapTupleHeader) DatumGetPointer(heap_copy_tuple_as_datum(tuple, slot->tts_tupleDescriptor)); heap_freetuple(tuple); That's kind of annoying because of the double copying involved, but tuptoaster.c doesn't expose any functionality for this except heap_copy_tuple_as_datum(). I figure we can improve it later --- it looks like we can refactor heap_copy_tuple_as_datum to expose a function that reads from Datum/isnull arrays, and then call it on the slot's tts_values/tts_isnull arrays. Seems like it might be a good idea to think a bit harder about the TupleTableSlot APIs, too, and reduce the number of cases where execTuples exposes destructive changes to the state of a slot. Also, while trying to test the above scenario, I realized that the patch as submitted was being way too cavalier about where it was applying CheckVarSlotCompatibility and so on. The ASSIGN_FOO_VAR steps, for instance, had no protection at all. Think I have that all fixed up though. I hope to have a fully reviewed patch to pass back to you tomorrow. Or Saturday at the latest. regards, tom lane
pgsql-hackers by date: