Thread: Shouldn't pg_(sh)seclabel.provider be marked NOT NULL?
Some of my Salesforce colleagues are looking into making every system catalog be declared with a true primary key. They came across the fact that pg_seclabel and pg_shseclabel are declared with unique indexes that include the "provider" column, but that column does not get marked as NOT NULL during initdb. Shouldn't it be? For that matter, it doesn't look to me like the code intends to ever store a null value into the label column either --- should that also be marked NOT NULL? I think we've generally been lazy about marking variable-width catalog columns as NOT NULL (note bootstrap mode will mark fixed-width columns as NOT NULL automatically). I'm not necessarily arguing to try to clean this up altogether, but it would be good I think if indexable columns got marked NOT NULL whenever possible. regards, tom lane
> On Jun 20, 2014, at 10:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Some of my Salesforce colleagues are looking into making every system > catalog be declared with a true primary key. They came across the > fact that pg_seclabel and pg_shseclabel are declared with unique > indexes that include the "provider" column, but that column does not > get marked as NOT NULL during initdb. Shouldn't it be? At some point, I inferred a rule that catalog columns were intended to be either both fixed-width and not nullable; or variable-widthand nullable. I believe the current situation is the result of that inference... but I see no particular reasonnot to change it. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: >> On Jun 20, 2014, at 10:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Some of my Salesforce colleagues are looking into making every system >> catalog be declared with a true primary key. They came across the >> fact that pg_seclabel and pg_shseclabel are declared with unique >> indexes that include the "provider" column, but that column does not >> get marked as NOT NULL during initdb. Shouldn't it be? > At some point, I inferred a rule that catalog columns were intended to > be either both fixed-width and not nullable; or variable-width and > nullable. I believe the current situation is the result of that > inference... but I see no particular reason not to change it. The actual rule that's embodied in the bootstrap code is to mark everything that could potentially be referenced via a C struct field as not nullable: which is to say, fixed-width fields up till we get to the first variable-width one. It's fairly likely that this is *not* marking all the columns that the code expects to be non-null in practice. The idea I'm toying with right now is to additionally mark as not nullable any column referenced in a DECLARE_UNIQUE_INDEX command in catalog/indexing.h. But I've not looked through that set carefully; it's conceivable that we actually have some indexed catalog columns that are allowed to be null. A possibly better solution is to invent a new macro that has the same semantics as DECLARE_UNIQUE_INDEX, plus forcing the columns to be marked NOT NULL. A bigger-picture question is whether there are yet more columns that could be marked not null, and how much we care about making them so. regards, tom lane
Tom Lane wrote: > The idea I'm toying with right now is to additionally mark as not nullable > any column referenced in a DECLARE_UNIQUE_INDEX command in > catalog/indexing.h. But I've not looked through that set carefully; it's > conceivable that we actually have some indexed catalog columns that are > allowed to be null. A possibly better solution is to invent a new macro > that has the same semantics as DECLARE_UNIQUE_INDEX, plus forcing the > columns to be marked NOT NULL. I think most, if not all, the unique indexes declared are part of a syscache. I don't think we allow those to be null, so in effect those columns are already not nullable. Non-unique indexes in indexing.h already bear a standard comment that they are not used for syscache. The only exception was added recently in f01d1ae3a104019: DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops)); -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-06-20 16:50:15 -0400, Alvaro Herrera wrote: > Tom Lane wrote: > > > The idea I'm toying with right now is to additionally mark as not nullable > > any column referenced in a DECLARE_UNIQUE_INDEX command in > > catalog/indexing.h. But I've not looked through that set carefully; it's > > conceivable that we actually have some indexed catalog columns that are > > allowed to be null. A possibly better solution is to invent a new macro > > that has the same semantics as DECLARE_UNIQUE_INDEX, plus forcing the > > columns to be marked NOT NULL. > > I think most, if not all, the unique indexes declared are part of a > syscache. I don't think we allow those to be null, so in effect those > columns are already not nullable. > Non-unique indexes in indexing.h > already bear a standard comment that they are not used for syscache. > The only exception was added recently in f01d1ae3a104019: > DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops)); There's no NULLs in here. It can have duplicates, but in that it's far from alone. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-06-20 16:50:15 -0400, Alvaro Herrera wrote: >> Non-unique indexes in indexing.h >> already bear a standard comment that they are not used for syscache. >> The only exception was added recently in f01d1ae3a104019: >> DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops)); > There's no NULLs in here. It can have duplicates, but in that it's far > from alone. I think Alvaro was complaining that it's alone in lacking this comment: /* This following index is not used for a cache and is not unique */ But TBH, I don't think those comments are worth much. I'd rather get rid of them all and instead add an Assert to the cache code enforcing that any index underlying a catcache is unique. It looks like the easiest place to do that is InitCatCachePhase2 --- that's the only place in catcache.c that actually opens the underlying index directly. I'd like to also have an Assert in there that the index columns are marked NOT NULL, but not sure if they actually all are marked that way today. regards, tom lane
On 2014-06-20 17:29:33 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-06-20 16:50:15 -0400, Alvaro Herrera wrote: > >> Non-unique indexes in indexing.h > >> already bear a standard comment that they are not used for syscache. > >> The only exception was added recently in f01d1ae3a104019: > >> DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops)); > > > There's no NULLs in here. It can have duplicates, but in that it's far > > from alone. > > I think Alvaro was complaining that it's alone in lacking this comment: > /* This following index is not used for a cache and is not unique */ > > But TBH, I don't think those comments are worth much. I'd rather get > rid of them all and instead add an Assert to the cache code enforcing > that any index underlying a catcache is unique. It looks like the > easiest place to do that is InitCatCachePhase2 --- that's the only place > in catcache.c that actually opens the underlying index directly. > > I'd like to also have an Assert in there that the index columns are > marked NOT NULL, but not sure if they actually all are marked that > way today. Sounds sensible. If they aren't marking them as such hopefully isn't problematic... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2014-06-20 16:50:15 -0400, Alvaro Herrera wrote: > > I think most, if not all, the unique indexes declared are part of a > > syscache. I don't think we allow those to be null, so in effect those > > columns are already not nullable. > > Non-unique indexes in indexing.h > > already bear a standard comment that they are not used for syscache. > > The only exception was added recently in f01d1ae3a104019: > > DECLARE_INDEX(pg_class_tblspc_relfilenode_index, 3455, on pg_class using btree(reltablespace oid_ops, relfilenode oid_ops)); > > There's no NULLs in here. It can have duplicates, but in that it's far > from alone. I'm only saying it's missing the /* this index is not unique */ comment that all other DECLARE_INDEX() lines have. Sorry I wasn't clear. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-06-20 17:29:33 -0400, Tom Lane wrote: >> I think Alvaro was complaining that it's alone in lacking this comment: >> /* This following index is not used for a cache and is not unique */ >> >> But TBH, I don't think those comments are worth much. I'd rather get >> rid of them all and instead add an Assert to the cache code enforcing >> that any index underlying a catcache is unique. It looks like the >> easiest place to do that is InitCatCachePhase2 --- that's the only place >> in catcache.c that actually opens the underlying index directly. >> >> I'd like to also have an Assert in there that the index columns are >> marked NOT NULL, but not sure if they actually all are marked that >> way today. > Sounds sensible. If they aren't marking them as such hopefully isn't > problematic... Experimental result from adding an Assert in CatalogCacheInitializeCache is that it doesn't blow up :-). So we do have them all marked correctly. regards, tom lane