Thread: Unsafe use of relation->rd_options without checking its type
I looked into bug#14397. The submitter was discourteous enough not to provide an in-line example, but here it is: CREATE TABLE IF NOT EXISTS TEST_1 ( ID SERIAL PRIMARY KEY, C1 BYTEA, C2 TEXT NOT NULL, C3 BOOLEAN NOT NULL DEFAULT FALSE, CONSTRAINT TEST_1_unique_idx UNIQUE(C1, C2) create or replace view test as select * from test_1 with cascaded check option; insert into test (c1, c2, c3) values (decode('MTIzAAE=', 'base64'), 'text', true) on conflict (c1, c2) do update set c3=excluded.c3; ERROR: ON CONFLICT is not supported on table "test" used as a catalog table LINE 1: ...lues (decode('MTIzAAE=', 'base64'), 'text', true) on conflic... ^ The reason for the error is that transformOnConflictArbiter applies RelationIsUsedAsCatalogTable() to something that it doesn't know to be a plain relation --- it's a view in this case. And that macro blindly assumes that relation->rd_options is a StdRdOptions struct, when in this case it's actually a ViewOptions struct. Now that I've seen this I wonder which other uses of rd_options are potentially broken. RelationIsUsedAsCatalogTable() is hardly the only macro that is assuming with little justification that it's applied to the right kind of reloptions. We could band-aid this by having the RelationIsUsedAsCatalogTable() macro check relation->relkind, but I'm inclined to think that the right fix going forward is to insist that StdRdOptions, ViewOptions, and the other in-memory representations of reloptions ought to be self-identifying somehow. We could add a field to them similar to nodeTag, but one of the things that was envisioned to begin with is relation types that have StdRdOptions as the first part of a larger struct. I'm not sure how to make that work with a tag. Thoughts? regards, tom lane
On Mon, Oct 31, 2016 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Now that I've seen this I wonder which other uses of rd_options are > potentially broken. RelationIsUsedAsCatalogTable() is hardly the > only macro that is assuming with little justification that it's > applied to the right kind of reloptions. It seems worth adding an assertion, at least. I wonder what running the regression tests with a bunch of similar assertions shows up... -- Peter Geoghegan
^
The reason for the error is that transformOnConflictArbiter applies
RelationIsUsedAsCatalogTable() to something that it doesn't know to
be a plain relation --- it's a view in this case. And that macro
blindly assumes that relation->rd_options is a StdRdOptions struct,
when in this case it's actually a ViewOptions struct.
Now that I've seen this I wonder which other uses of rd_options are
potentially broken. RelationIsUsedAsCatalogTable() is hardly the
only macro that is assuming with little justification that it's
applied to the right kind of reloptions.
Right, there are many macros, which blindly assume the rd_options
to be of specific type. Here is the list of such macros :
RelationGetFillFactor
RelationIsUsedAsCatalogTable
RelationGetParallelWorkers
RelationIsSecurityView
RelationHasCheckOption
RelationHasLocalCheckOption
RelationHasCascadedCheckOption
BrinGetPagesPerRange
GinGetUseFastUpdate
GinGetPendingListCleanupSize
For the macros associated with specific index type, it might be alright to
assume the type of the rd_options because those have very limited usage.
However for the rest of the macros, the code does not limit its usage. This can
lead to problems as you described above .
We could band-aid this by having the RelationIsUsedAsCatalogTable()
macro check relation->relkind, but I'm inclined to think that the
right fix going forward is to insist that StdRdOptions, ViewOptions,
and the other in-memory representations of reloptions ought to be
self-identifying somehow. We could add a field to them similar to
nodeTag, but one of the things that was envisioned to begin with
is relation types that have StdRdOptions as the first part of a
larger struct. I'm not sure how to make that work with a tag.
Thoughts?
Yes, it seems appropriate to tag all types of the rd_options with an
identification and maybe check for that identification within each macro.
Currently, there are following types of rd_options as far as I could find:
GiSTOptions
BrinOptions
GinOptions
StdRdOptions
ViewOptions
BloomOptions (from contrib)
I am not clear on how the identification field in the above structures can
be a problem, for the relations have StdRdOptions as the first part of a
larger structure.
Regards,
Neha
I wrote: > Now that I've seen this I wonder which other uses of rd_options are > potentially broken. RelationIsUsedAsCatalogTable() is hardly the > only macro that is assuming with little justification that it's > applied to the right kind of reloptions. > > We could band-aid this by having the RelationIsUsedAsCatalogTable() > macro check relation->relkind, ... I've pushed a hack along those lines, so that we could fix the reported problem in the back branches. > ... but I'm inclined to think that the > right fix going forward is to insist that StdRdOptions, ViewOptions, > and the other in-memory representations of reloptions ought to be > self-identifying somehow. We could add a field to them similar to > nodeTag, but one of the things that was envisioned to begin with > is relation types that have StdRdOptions as the first part of a > larger struct. I'm not sure how to make that work with a tag. After a bit of thought, I propose the following conventions: 1. All decoded reloptions structs (anything that Relation.rd_options could point to) shall embed these common header fields: typedef struct BaseRdOptions { int32 vl_len_; /* varlena header (do not touch directly!) */ int options_id; /* code identifyingspecific reloptions */ /* useful data follows */ } BaseRdOptions; We'll keep the rule that these structs have a varlena length word, since having the struct size stored that way allows easy copying with datumCopy. 2. Basic reloptions structs that share no payload fields with anything else will be assigned options_id values in the range 1..255. 3. A struct type that wants to extend, say, StdRdOptions with some extra fields would use an options_id that has StdRdOptions's code in its low order byte and a more specific identity in the next higher byte. Similarly, one could extend it again with some identity bits placed in the third byte. There could be up to three levels of nesting before we run out of space in the ID word, and that seems like probably enough. 4. So if you want to test whether a particular options struct has StdRdOptions fields, you would need to execute a test like "(rel->rd_options->options_id & 0xFF) == STDRDOPTIONS_ID". This could and should be embedded in a standard test macro, of course. 5. The end goal would be to have all functions/macros that access rd_options include explicit checks that the data is what they expect, eg instead of naively doing this: #define RelationGetFillFactor(relation, defaultff) ((relation)->rd_options ? ((StdRdOptions *) (relation)->rd_options)->fillfactor: (defaultff)) we should do this: #define RelationGetFillFactor(relation, defaultff) ((relation)->rd_options && IsStdRdOptions((relation)->rd_options)? ((StdRdOptions *) (relation)->rd_options)->fillfactor : (defaultff)) Unless somebody has an objection or better idea, I'll push forward with making this happen in HEAD. regards, tom lane