Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() - Mailing list pgsql-bugs
From | Andrew Dunstan |
---|---|
Subject | Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() |
Date | |
Msg-id | 34ff6307-d5be-d008-56ed-6249f3cd0749@dunslane.net Whole thread Raw |
In response to | Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() (Alexander Lakhin <exclusion@gmail.com>) |
List | pgsql-bugs |
On 2023-07-15 Sa 03:00, Alexander Lakhin wrote: > Hello, > > 28.06.2023 21:49, Tom Lane wrote: >> The bad news is that while investigating this I came across >> another hazard that seems much messier to fix. To wit, if >> you have a tuple with "missing" pass-by-ref columns, then >> some of the columns output by heap_deform_tuple() will >> contain pointers into the tupdesc (cf. getmissingattr()). >> That's not sufficient lifespan in the scenario we're dealing >> with here: if the tupdesc gets trashed anytime before the >> eventual ExecEvalFieldStoreForm(), we'll have garbage in the >> result tuple. > > Please excuse me, as I'm definitely late to the party, but may I ask some > questions to understand the issue discussed? I still can't see a > practical > way to get garbage data in a tuple with missing columns, so maybe I'm > missing something (except for columns). > > If we talk about ExecEvalFieldStoreDeForm() -> heap_deform_tuple() -> > getmissingattr(), then I can't find a way to add a non-null default for a > record type to exploit this path. The rowtypes test contains: > -- at the moment this will not work due to ALTER TABLE inadequacy: > alter table fullname add column suffix text default ''; > ERROR: cannot alter table "fullname" because column "people.fn" uses > its row type > > So it seems to me, that ExecEvalFieldStoreDeForm() can't be called for a > record/tuple having missing non-null attributes. And even if ALTER > TABLE was > "adequate", does it mean that ExecEvalFieldStoreDeForm() would return > default > values from the underlying table type? > >> Worse, I fear there may be many other places with latent bugs of the >> same ilk. Before the attmissingval code was added, one could assume >> that the result of heap_deform_tuple would hold good as long as you >> had a pin on the source tuple. But now it depends on metadata as >> well, and I don't think we have a coherent story about that. > > But is it true, that having a pin on tupdesc is enough to make sure that > the result of heap_deform_tuple() holds good? > > If so, then probably we should find places, where tupdesc gets unpinned > before that result is saved somewhere. > I see the following callers of the heap_deform_tuple(): > src/backend/replication/logical/reorderbuffer.c > ReorderBufferToastReplace(): heap_deform_tuple() followed by > heap_form_tuple() > src/backend/executor/execTuples.c > ExecForceStoreHeapTuple(), ExecForceStoreMinimalTuple(), > ExecStoreHeapTupleDatum(): use long-living and hopefully pinned > slot->tts_tupleDescriptor > src/backend/executor/execExprInterp.c > ExecEvalFieldStoreDeForm(): the above question applies here > src/backend/executor/spi.c > SPI_modifytuple(): heap_deform_tuple() followed by heap_form_tuple() > src/backend/utils/adt/rowtypes.c > record_out(), hash_record(), hash_record_extended(): tupdesc > released after extracting/processing data > record_send(): tupdesc released after sending data > record_cmp(), record_eq(), record_image_cmp(), record_image_eq(): > tupdesc1. tupdesc2 released after comparing data > src/backend/utils/adt/jsonfuncs.c > populate_record(): heap_deform_tuple() followed by heap_form_tuple() > src/backend/utils/adt/expandedrecord.c > deconstruct_expanded_record() uses long-living (pinned) > erh->er_tupdesc > src/backend/utils/adt/ri_triggers.c > RI_Initial_Check(), RI_PartitionRemove_Check(): use long-living? > SPI_tuptable->tupdesc > src/backend/access/heap/heapam_handler.c > reform_and_rewrite_tuple(): uses RelationGetDescr(OldHeap) > (pinned, I suppose) > src/backend/access/heap/heaptoast.c > heap_toast_delete(), heap_toast_insert_or_update() use long-living > rel->rd_att; > toast_flatten_tuple(): heap_deform_tuple(..., tupleDesc, ...) > followed by heap_form_tuple(..., tupleDesc, ...) > toast_flatten_tuple_to_datum(): heap_deform_tuple followed by > detoasting attrs > src/backend/access/heap/heapam.c > ExtractReplicaIdentity(): heap_deform_tuple(..., desc, ...) > followed by heap_form_tuple(..., desc, ...) > src/backend/access/common/heaptuple.c > heap_modify_tuple(): heap_deform_tuple(..., tupleDesc, ...) > followed by heap_form_tuple(..., tupleDesc, ...) > heap_modify_tuple_by_cols(): heap_deform_tuple(..., tupleDesc, > ...) followed by heap_form_tuple(..., tupleDesc, ...) > src/backend/access/common/tupconvert.c > execute_attr_map_tuple(): heap_deform_tuple() followed by > heap_form_tuple() > src/test/regress/regress.c > make_tuple_indirect(): heap_deform_tuple() followed by > heap_form_tuple() > src/pl/plpgsql/src/pl_exec.c > exec_move_row() called with tupdesc = NULL, SPI_tuptable->tupdesc, > or tupdesc released after the call > contrib/hstore/hstore_io.c > hstore_from_record(), hstore_populate_record(): tupdesc released > after extracting data > > And even if I missed some possibly problematic calls, maybe it's worth to > consider fixing exactly that places... > > Also, besides heap_deform_tuple(), getmissingattr() is called from > heap_getattr(). And heap_getattr() is used in many places, but most of > them > are protected too due to tupdesc pinned. > Two suspicious places that I found are GetAttributeByName() and > GetAttributeByNum(), so when using these functions you can really get the > missing attribute value with the tupdesc released. But what > circumstances do > we need to end up with an invalid data? > 1) Write a function that will call GetAttributeByName() and hold it's > result inside for some time. > 2) Add a passed-by-ref column with a default value to a table. > 3) Pass to that function a tuple without the "missing" column (just > "SELECT func(table) FROM table" won't do, but func(OLD) in a trigger > should). > 4) Trigger a relcache invalidation. > 5) Perform a database access inside that function to process the > invalidation. > 6) Access the result of GetAttributeByName() after that. > > Maybe we could construct such test case, e.g. add to pg_regress one more > SQL-accessible C function, bu9ccvhhihvcnb·t I wonder, is this the way > to go? > > On the other hand, probably there are extensions, that use > GetAttributeByName() in the non-safe way (as the function documentation > doesn't warn about such issues), but maybe just perform datumCopy inside > that function (and similar one(s))? > Though, perhaps I just don't notice the elephant in the core... I was waiting to see if others had a reaction to this, but ... I think the last point (possible unsafe use by extensions) is reason enough to proceed with the proposed mitigation, which is pretty simple and non-invasive.. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
pgsql-bugs by date: