Thread: ID column naming convention
Hi, is there a problem calling ID's different when used as a FK vs table ID? For example mydimtable () ID name description myfacttable () my_dim_id # FK to ID above total_sales I 'think' if I don't enforce foreign key constraints, then this practice prevents tools from being able to generate ERD diagrams right? -- View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On 10/13/2015 11:36 AM, droberts wrote:
Pretty sure _all_ ER diagramming tools rely on the DDL of column constraints not the names. And personally I would name the column mydim_id. i.e don't inject the extra underscore which is not in your table name. (And I assume the table names are fake: you don't really want "table" in table name)Hi, is there a problem calling ID's different when used as a FK vs table ID? For example mydimtable ()IDnamedescription myfacttable () my_dim_id # FK to ID above total_sales I 'think' if I don't enforce foreign key constraints, then this practice prevents tools from being able to generate ERD diagrams right? -- View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
I 'think' if I don't enforce foreign key constraints, then this practice
prevents tools from being able to generate ERD diagrams right?
Yes, if you don't name them the same and don't setup an actual FK then there is no reason to assume that two random fields on separate tables are in any way related.
I suspect Rob is right that even naming them the same isn't enough - but that would depend upon the tool. Given the, IMO misguided, prevalence of naming every PK fields "id" about the only meaningful logic such a tool could employ would be to look for fields of the form "table_id" and if "table" has an "id" field assume that they are indeed related. Since your example use "my_dim" for a field that exists on the "mydimtable" even this logic would be insufficient to guess the link you know exists but don't encode into the schema.
Personally, I don't care whether it is a PK or FK - ID fields are database unique and when present always refer to the same entity. In this case I would call the field "mydim_id" on both tables. I would expect to have to define UNIQUE (PRIMARY KEY) and FOREIGN KEY constraints if I wanted tools to understand the relationship between the two tables even though I standardized the name of the ID field.
There are implications to choosing any particular naming convention. I'm not familiar with any that are so severe that I would call them problems.
David J.
On 14/10/15 06:36, droberts wrote: > Hi, is there a problem calling ID's different when used as a FK vs table ID? > For example > > > mydimtable () > ID > name > description > > > myfacttable () > my_dim_id # FK to ID above > total_sales > > > I 'think' if I don't enforce foreign key constraints, then this practice > prevents tools from being able to generate ERD diagrams right? > > > > -- > View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html > Sent from the PostgreSQL - general mailing list archive at Nabble.com. > > My practice is to name the PRIMARY KEY as id, and foreign keys with the original table name plus the sufiix_id. By leaving the table name off the primary key name, and just using id, makes it more obvious that it is a primary key (plus it seems redundant to prefix the primary key name with its own table name!). CREATE TABLE house ( id int PRIMARY KEY, address text ); CREATE TABLE room ( id int PRIMARY KEY, house_id int REFERENCES house(id), name text ); There are exceptions like: CREATE TABLE human ( id int PRIMARY KEY, mother_id int REFERENCES human (id), father_id int REFERENCES human (id), name text ); Cheers, Gavin
Gavin Flower-2 wrote > On 14/10/15 06:36, droberts wrote: >> Hi, is there a problem calling ID's different when used as a FK vs table >> ID? >> For example >> >> >> mydimtable () >> ID >> name >> description >> >> >> myfacttable () >> my_dim_id # FK to ID above >> total_sales >> >> >> I 'think' if I don't enforce foreign key constraints, then this practice >> prevents tools from being able to generate ERD diagrams right? >> >> >> >> -- >> View this message in context: >> http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html >> Sent from the PostgreSQL - general mailing list archive at Nabble.com. >> >> > My practice is to name the PRIMARY KEY as id, and foreign keys with the > original table name plus the sufiix_id. > > By leaving the table name off the primary key name, and just using id, > makes it more obvious that it is a primary key (plus it seems redundant > to prefix the primary key name with its own table name!). > > CREATE TABLE house > ( > id int PRIMARY KEY, > address text > ); > > CREATE TABLE room > ( > id int PRIMARY KEY, > house_id int REFERENCES house(id), > name text > ); > > > There are exceptions like: > > CREATE TABLE human > ( > id int PRIMARY KEY, > mother_id int REFERENCES human (id), > father_id int REFERENCES human (id), > name text > ); > > Cheers, > Gavin > > > -- > Sent via pgsql-general mailing list ( > pgsql-general@ > ) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general Thanks. My only question is how do you create a schema diagram (ERD) then? The tool won't know what the relationships are unless maybe you put foreign key constraints on. BTW does anyone recommend a tool to to that? I've been playing with DbVisualizer. -- View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844p5869881.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On 10/13/2015 3:27 PM, droberts wrote: > Thanks. My only question is how do you create a schema diagram (ERD) then? > The tool won't know what the relationships are unless maybe you put foreign > key constraints on. BTW does anyone recommend a tool to to that? I've been > playing with DbVisualizer. I don't know of any ERD tool that will recognize foreign key references without FK constraint definitions, regardless of what things are named. -- john r pierce, recycling bits in santa cruz
On Tue, Oct 13, 2015 at 03:44:15PM -0700, John R Pierce wrote: > On 10/13/2015 3:27 PM, droberts wrote: > >Thanks. My only question is how do you create a schema diagram (ERD) then? > >The tool won't know what the relationships are unless maybe you put foreign > >key constraints on. BTW does anyone recommend a tool to to that? I've been > >playing with DbVisualizer. > > > I don't know of any ERD tool that will recognize foreign key references > without FK constraint definitions, regardless of what things are named. Also, there wouldn't be anything to recognize. Only to assume. Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 10/13/15 2:34 PM, Gavin Flower wrote: >> > My practice is to name the PRIMARY KEY as id, and foreign keys with the > original table name plus the sufiix_id. > > By leaving the table name off the primary key name, and just using id, > makes it more obvious that it is a primary key (plus it seems redundant > to prefix the primary key name with its own table name!). There's two things that are ugly about that though: Joins become MUCH easier to screw up. When you have 5 different fields that are all called 'id' it's trivial to mix them up. It's much harder to accidentally do something like 'blah.person_id = foo.invoice_id'. The other issue is common to all "bare word" names (id, name, description, etc): it becomes completely impossible to find all occurrences of something in code. If you grep your entire codebase for 'person_id', you know you'll find exactly what you want. Grepping for 'id' OTOH would be useless. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On 16/10/15 13:09, Jim Nasby wrote: > On 10/13/15 2:34 PM, Gavin Flower wrote: >>> >> My practice is to name the PRIMARY KEY as id, and foreign keys with the >> original table name plus the sufiix_id. >> >> By leaving the table name off the primary key name, and just using id, >> makes it more obvious that it is a primary key (plus it seems redundant >> to prefix the primary key name with its own table name!). > > There's two things that are ugly about that though: > > Joins become MUCH easier to screw up. When you have 5 different fields > that are all called 'id' it's trivial to mix them up. It's much harder > to accidentally do something like 'blah.person_id = foo.invoice_id'. > > The other issue is common to all "bare word" names (id, name, > description, etc): it becomes completely impossible to find all > occurrences of something in code. If you grep your entire codebase for > 'person_id', you know you'll find exactly what you want. Grepping for > 'id' OTOH would be useless. It would seem to be very dodgy to us a join based on apparently very different semantic implied by 'blah.person_id = foo.invoice_id'!!! :-) Because 2 fields in different tables have the same name, it does not necessarily mean they have the same semantics. For example 2 tables could have a field named 'start_date', but the one in a table called 'employment' would have different semantics to the one in 'project'. Since 'id' is only used to indicate a PRIMARY KEY, there is less confusion in joins, and it is clear when something is a foreign key rather than a PRIMARY KEY. For example, if two tables both refer to the same human, you can join using a.human_id = b.human_id - and it is clearer when you are joining a child to a parent table, for example line_item.stock_id = stock.id. Adopting you convention, it would result in not only picking up foreign key references, but also the primary keys - which may, or may not, too helpful! It would be very rare to have a join such as project.id = task.id, it is usually a mistake to join tables on their primary key - so using just 'id' as the PRIMARY KEY name is a bonus. I once devised a stored procedure in SyBase with over 3,000 lines of SQL (I would have broken it up in smaller units, but it was not practicable in that development environment). It had 7 temporary tables, 5 used 'id' as the PRIMARY KEY - and 2 used the name of the PRIMARY KEY of an existing table ('tcs_id' & 'perorg_seq'), because that made more sense, as they had the the same semantic meaning. I did not design the 2 databases I queried, but I suspect sometimes I might decide it best to use something other than just 'id' - but it would be very rare (I won't say never!) that I'd use the table name as a prefix for the primary key. Searching on a bare word names can be useful when the fields have similar, related semantics. In a real database, I'd be very unlikely to use 'name' for a field, though using 'description' might be valid. Though in general, I would agree that using several words in a name is normally preferable. Also it would also be better to define appropriate DOMAINs rather than just using bare types like 'text' & 'int' - to better document the semantics and make it easier to change things in a more controlled way. If one was grepping for the occurrences of the use of the PRIMARY KEY of the table human, you would look for 'human_id' you would only grep for 'id' if one wanted to find the use of PRIMARY KEYs. No naming convention is perfect in all situations, and I'll adapt mine as appropriate. In my experience, my convention (well to be honest, I adopted it from others - so I can't claim to have originated it!) seems to be better in general. Essentially it is a guideline, I won't insist that you have have your computers confiscated if you use a different convention!
On Fri, Oct 16, 2015 at 02:28:25PM +1300, Gavin Flower wrote: > Since 'id' is only used to indicate a PRIMARY KEY, there is less confusion > in joins, and it is clear when something is a foreign key rather than a > PRIMARY KEY. Given that "id" often has meaning outside the database I much prefer naming my primary keys "pk". And foreign keys "fk_TABLENAME": line_item.pk_stock = stock.pk Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 10/15/15 8:28 PM, Gavin Flower wrote: > It would seem to be very dodgy to us a join based on apparently very > different semantic implied by 'blah.person_id = foo.invoice_id'!!! :-) > > Because 2 fields in different tables have the same name, it does not > necessarily mean they have the same semantics. For example 2 tables > could have a field named 'start_date', but the one in a table called > 'employment' would have different semantics to the one in 'project'. Right, which is why the fields should be called employment_start_date and project_start_date, not just start_date. > Since 'id' is only used to indicate a PRIMARY KEY, there is less > confusion in joins, and it is clear when something is a foreign key > rather than a PRIMARY KEY. For example, if two tables both refer to the > same human, you can join using a.human_id = b.human_id - and it is > clearer when you are joining a child to a parent table, for example > line_item.stock_id = stock.id. > > Adopting you convention, it would result in not only picking up foreign > key references, but also the primary keys - which may, or may not, too > helpful! It generally shouldn't matter, because a person_id is *always* a person_id. In cases where it does matter then the field name alone probably won't help you much, if at all. Of course, this has been a debate forever, so I know neither of us will convince the other to change. ;) I just wanted to point out some things that hadn't come up already. BTW, I found Karsten's idea of using 'pk' for the surrogate key, and fk_table_name interesting. It helps avoid ambiguity from externally generated ID values. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On Fri, Oct 16, 2015 at 04:59:12PM -0500, Jim Nasby wrote: > BTW, I found Karsten's idea of using 'pk' for the surrogate key, and > fk_table_name interesting. It helps avoid ambiguity from externally > generated ID values. That's the point :-) Here's a real live schema using (mostly) the above approach: http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/ in case anyone is interested in taking a look. Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
> On Oct 13, 2015, at 18:27, droberts <david.roberts@riverbed.com> wrote: > > Gavin Flower-2 wrote >>> On 14/10/15 06:36, droberts wrote: >>> Hi, is there a problem calling ID's different when used as a FK vs table >>> ID? >>> For example >>> >>> >>> mydimtable () >>> ID >>> name >>> description >>> >>> >>> myfacttable () >>> my_dim_id # FK to ID above >>> total_sales >>> >>> >>> I 'think' if I don't enforce foreign key constraints, then this practice >>> prevents tools from being able to generate ERD diagrams right? >>> >>> >>> >>> -- >>> View this message in context: >>> http://postgresql.nabble.com/ID-column-naming-convention-tp5869844.html >>> Sent from the PostgreSQL - general mailing list archive at Nabble.com. >> My practice is to name the PRIMARY KEY as id, and foreign keys with the >> original table name plus the sufiix_id. >> >> By leaving the table name off the primary key name, and just using id, >> makes it more obvious that it is a primary key (plus it seems redundant >> to prefix the primary key name with its own table name!). >> >> CREATE TABLE house >> ( >> id int PRIMARY KEY, >> address text >> ); >> >> CREATE TABLE room >> ( >> id int PRIMARY KEY, >> house_id int REFERENCES house(id), >> name text >> ); >> >> >> There are exceptions like: >> >> CREATE TABLE human >> ( >> id int PRIMARY KEY, >> mother_id int REFERENCES human (id), >> father_id int REFERENCES human (id), >> name text >> ); >> >> Cheers, >> Gavin >> >> >> -- >> Sent via pgsql-general mailing list ( > >> pgsql-general@ > >> ) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > Thanks. My only question is how do you create a schema diagram (ERD) then? > The tool won't know what the relationships are unless maybe you put foreign > key constraints on. That's how most tools work, usually by calling the driver api (jdbc databasemetadata, etc....) which in turn look at theinformation_schema. If you don't setup real referential integrity, any tool that can use names is just guessing I think dbvisualizer will 'infer' based on column names. I KNOW that schemaspy has this option, but they explicitly noteit is a GUESS. Use foreign keys. > BTW does anyone recommend a tool to to that? I've been > playing with DbVisualizer. > > > > -- > View this message in context: http://postgresql.nabble.com/ID-column-naming-convention-tp5869844p5869881.html > Sent from the PostgreSQL - general mailing list archive at Nabble.com. > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
On 18/10/15 00:13, Karsten Hilbert wrote: > On Fri, Oct 16, 2015 at 04:59:12PM -0500, Jim Nasby wrote: > >> BTW, I found Karsten's idea of using 'pk' for the surrogate key, and >> fk_table_name interesting. It helps avoid ambiguity from externally >> generated ID values. > That's the point :-) > > Here's a real live schema using (mostly) the above approach: > > http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/ > > in case anyone is interested in taking a look. > > Karsten Hmm... Apparently (according to your naming convention) several tables (such as 'clin.substance_intake') have 2 PRIMARY KEYs!!! I guess you must have a large wall to display the schema on! Looks like a lot of fun. Can you tell us more about the database (ignoring nit-picking!)? Cheers, Gavin
> > Here's a real live schema using (mostly) the above approach: > > > > http://www.gnumed.de/~ncq/gnumed/schema/gnumed_v20/ > > > > in case anyone is interested in taking a look. > > > > Karsten > > Apparently (according to your naming convention) several tables (such as > 'clin.substance_intake') have 2 PRIMARY KEYs!!! You've picked perhaps the most complex table in there :-) > Can you tell us more about the database (ignoring nit-picking!)? Sure. All the clinical tables inherit from clin.clin_root_item which itself has a primary key (pk_item) and which, in turn, inherits from audit.audit_fields, which, again, has a primary key of its own (pk_audit). That is what makes several tables appear to have more than one primary key. However, only the top level serial column is declaratively set to be the actual primary key of any one table. Certainly, hose columns all _do_ have the properties of primary keys. Inheritance from clin.clin_root_item is leveraged for cross-table search of clinical narrative, for one thing. That table is also overly complex as it needs to take account of patients being documented to be on treatment regimes with a mixture of either branded drugs, possibly containing several active substances, and generic substances without a documented brand. That's why it is overly complex (I haven't found a better way to implement this requirement). Anything else you'd like to know ? Karsten
On 10/18/15 4:32 AM, Karsten Hilbert wrote: >> Can you tell us more about the database (ignoring nit-picking!)? > Sure. All the clinical tables inherit from clin.clin_root_item > which itself has a primary key (pk_item) and which, in turn, > inherits from audit.audit_fields, which, again, has a primary > key of its own (pk_audit). That is what makes several tables > appear to have more than one primary key. FWIW, I intentionally break some of my conventions when using inheritance. For example, I normally use a single sequence to drive ALL surrogate keys for the tables, and child tables keep the same surrogate key as the parent (which means naming doesn't match). IMHO it makes more sense to have inheritance do the work for you on this stuff. BTW, if there's any interest, I have code that sets up a non-inheritance table specifically for doing foreign keys to the inheritance parent. It doesn't support updates right now, but should be fully safe from a FK standpoint. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On Mon, Oct 19, 2015 at 04:25:39AM -0500, Jim Nasby wrote: > BTW, if there's any interest, I have code that sets up a non-inheritance > table specifically for doing foreign keys to the inheritance parent. It > doesn't support updates right now, but should be fully safe from a FK > standpoint. Can you tell me more ? Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 10/22/15 11:37 AM, Karsten Hilbert wrote: > On Mon, Oct 19, 2015 at 04:25:39AM -0500, Jim Nasby wrote: > >> BTW, if there's any interest, I have code that sets up a non-inheritance >> table specifically for doing foreign keys to the inheritance parent. It >> doesn't support updates right now, but should be fully safe from a FK >> standpoint. > > Can you tell me more ? I create a '_fk' table that looks like this: > CREATE TABLE _lead.lead_fk( > lead_id int NOT NULL PRIMARY KEY > , organic_lead_id int CONSTRAINT organic_lead_id_sanity CHECK( organic_lead_id IS NULLOR organic_lead_id = lead_id ) > , some_lead_provider_lead_id int CONSTRAINT some_lead_provider_lead_id_sanity CHECK( some_lead_provider_lead_idIS NULL OR some_lead_provider_lead_id = lead_id ) > -- TODO , CHECK( only one blah_lead_id field is NOT NULL using count_nulls() from PGXN ) > ); Then each table that inherits from lead.lead (as well as lead.lead itself) has: > , FOREIGN KEY (lead_id) REFERENCES _lead.lead_fk DEFERRABLE INITIALLY DEFERRED That FK is in place to ensure that when a lead record is inserted, a corresponding row is inserted into _lead.lead_fk as well. That insert is handled by this trigger function: CREATE OR REPLACE FUNCTION _lead.tg_lead_fk( ) RETURNS trigger LANGUAGE plpgsql -- !!!!!!!!! SECURITY DEFINER SET search_path = pg_catalog -- !!!!!!!!! AS $body$ BEGIN EXECUTE format( $$INSERT INTO _lead.lead_fk( lead_id, %I ) VALUES( $1, $1 )$$ , TG_TABLE_NAME || '_id' ) USING NEW.lead_id ; RETURN NEW; END $body$; Finally, a table that needs to have a FK to a lead has > , lead_id int NOT NULL REFERENCES _lead.lead_fk I also have the following in a pgTap test function to verify that the FK exists on all children of the lead.lead table. FOR r IN SELECT * FROM cat_tools.pg_class_v WHERE reloid = 'lead.lead'::regclass UNION ALL SELECT c.* FROM pg_inherits i JOIN cat_tools.pg_class_v c ON reloid = inhrelid WHERE inhparent = 'lead.lead'::regclass LOOP RETURN NEXT col_is_pk( r.relschema , r.relname , array[ 'lead_id' ] , 'lead_id is PK' ); RETURN NEXT fk_ok( r.relschema , r.relname , 'lead_id' , '_lead' , 'lead_fk' , 'lead_id' ); END LOOP; At some point I'll turn this into metacode so that setting all of this up is just a function call. I just haven't gotten to it yet. (Though, if someone wanted to pay me to do that... ;P ) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com