Thread: paradigm sanity check needed [long]
i'd like to run our concept up the flagpole and see what kind of shots go plumb through it-- you know how the standard recommendation is to have every field across the database with its own unique name? right? like school_student_certificate.school_student_certificate_year, long_enough_to_choke_a_horse.long_enough_to_choke_a_horse_thing... well, we're not doing that. not even close. all the literature says it's a good idea, but they don't say why. and we want to find out. :) here goes-- EVERY table of ours has *.id serial primary key, *.created date default current_date, *.modified timestamp(0), *.by references person(id), and when one table links to another (aside from the "edited by field above, that is) the name of the foreign key matches that of the table whose id we're after: team.person => person.id team.org => org.id dates.date_types => date_types.id loc.state => state.id loc.nation => nation.id project.org => org.id ...etc... programmatically -- at first glance -- this looks like a DREAM to manage. (this is how we know there must be something seriously wrong with it. red alert!) using this setup, we can easily link up, using sql, which tables refer (defer) to others -- detecting a parent/child relationship is a snap: -- here's the main "subroutine" used below: CREATE VIEW sys_tables as SELECT c.relname AS class, oid AS c_oid FROM pg_class c WHERE c.relname !~ '^pg_' -- not a postgres system table AND c.relname !~ '^_' -- not a table-behind-a-view AND c.relkind IN ('v','r') -- view or relation/table ORDER BY c.relname ; COMMENT ON COLUMN sys_tables.class is 'System Table (class) name'; then once we specify our lookup/validation table pattern, we can figure out the rest: -- linking tables (for many-to-many) -- subtables (many-to-one) -- data tables (the nacells of the enterprise) for the validation/lookup tables, they're all named *_TYPE -- and then there's the other two: "state" and "nation"; we have some special-case hard-coding for those two: -- VALIDATION TABLES (having *_TYPE names) CREATE VIEW sys_lookup_tables as SELECT t.class FROM sys_tables t WHERE ( t.class ~ '_type$' ) OR ( -- don't forget these! t.class IN ('state','nation') -- special cases ) ORDER BY t.class ; COMMENT ON COLUMN sys_lookup_tables.class is 'Name of verification/lookup table'; if we eventually add others, this is the ONE place we'd have to mention them. which normal form is THAT? :) this next view shows ALL subsidiary relationships, both subtables and linking (many-to-many) tables: -- SUB TABLES (TABLES having field names matching other table names) CREATE VIEW sys_sub_tables as SELECT s.class as sub, p.class as parent FROM pg_attribute a, sys_tables s, sys_tables p WHERE a.attrelid=s.c_oid -- field from Sub AND a.attname=p.class -- matches table Parent ORDER BY s.class, p.class ; COMMENT ON COLUMN sys_sub_tables.sub is 'Subsidiary table name'; COMMENT ON COLUMN sys_sub_tables.parent is 'Parent table name'; and here, we find the linking tables: CREATE VIEW sys_link_tables as SELECT s.sub as class FROM sys_sub_tables s left join sys_lookup_tables l on (l.class = s.parent) WHERE -- they're subtables, but not lookup tables l.class is null GROUP BY s.sub HAVING count(*) > 1 ; COMMENT ON COLUMN sys_link_tables.class is 'Name of linking (many-to-many) table'; (explain select * from sys_link_tables -- quite amusing!) pulling field descriptors out (comment) is pretty easy, too. (internationalization is a long way off, of course...) what do y'all think? "man the lifeboats"? should we be expecting sea water in the scuppers anytime soon? and why (or why not)... -- There are 10 kinds of people: ones that get binary, and ones that don't. will@serensoft.com http://sourceforge.net/projects/newbiedoc -- we need your brain! http://www.dontUthink.com/ -- your brain needs us! Looking for a firewall? Do you think smoothwall sucks? You're probably right... Try the folks at http://clarkconnect.org/ !
Hi, I played around with a similar paradigm shift for some time (except for the lookup tables, which I prefer to keep seperate). Its true that it would be ALICE IN WONDERLAND -- only if it were practical. what we hit against is SQL syntax... (just syntax, not conceptual clash) Now, SQL lets you spell out your fields when you use them, such as : person.id, person.created, ... Thats a good thing. BUT, the column names that are OUTPUT from a SELECT dont have the table name prefixed. Which can be a good or a bad thing, depending... Consider a query like this: SELECT * FROM person; The column names you will get out of this will not have the table name prefixed. Even the following wont work as you sometimes wished it did: SELECT person.id, person.created FROM person; It is on the otherhand possible to use AS with each and every column name : SELECT person.id AS person.id , person.created AS person.created FROM person; That should have the prefixes in... What a waste of finger energy though ... Besides, it wont ever work with the * (yeah I know they say * is a bad thing, but I dont think it is in all situations). When you have JOINS, its even worse... What is really needed is an other SELECT option (which could also be just an extension to AS) that will cause the output to have the origin table name prefixed. Maybe something like : SELECT person.* AS person.* FROM person; Or maybe better... SELECT * QUALIFIED FROM person; which could then be : SELECT person.* QUALIFIED, department.* QUALIFIED FROM person, department WHERE person.department = department.id; OR may be just a GLOBAL option for the entire SELECT, such as : SELECT QUALIFIED person.*, department.* FROM person, department WHERE person.department = department.id; Further name qualification that exteds to the SCHEMA and DATABASE (why not UNIVERSE? --:) could be opted with : SELECT [TABLE|SCHEMA|DATABASE] QUALIFIED * FROM person; Anyway, I am sure someone else could come up with a better syntax, but you get the idea ... Cheers, Ayhan -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of will trillich Sent: samedi 1 février 2003 05:28 To: pgsql-general@postgresql.org Subject: [GENERAL] paradigm sanity check needed [long] i'd like to run our concept up the flagpole and see what kind of shots go plumb through it-- you know how the standard recommendation is to have every field across the database with its own unique name? right? like school_student_certificate.school_student_certificate_year, long_enough_to_choke_a_horse.long_enough_to_choke_a_horse_thing... well, we're not doing that. not even close. all the literature says it's a good idea, but they don't say why. and we want to find out. :) here goes-- EVERY table of ours has *.id serial primary key, *.created date default current_date, *.modified timestamp(0), *.by references person(id), and when one table links to another (aside from the "edited by field above, that is) the name of the foreign key matches that of the table whose id we're after: team.person => person.id team.org => org.id dates.date_types => date_types.id loc.state => state.id loc.nation => nation.id project.org => org.id ...etc... programmatically -- at first glance -- this looks like a DREAM to manage. (this is how we know there must be something seriously wrong with it. red alert!) using this setup, we can easily link up, using sql, which tables refer (defer) to others -- detecting a parent/child relationship is a snap: -- here's the main "subroutine" used below: CREATE VIEW sys_tables as SELECT c.relname AS class, oid AS c_oid FROM pg_class c WHERE c.relname !~ '^pg_' -- not a postgres system table AND c.relname !~ '^_' -- not a table-behind-a-view AND c.relkind IN ('v','r') -- view or relation/table ORDER BY c.relname ; COMMENT ON COLUMN sys_tables.class is 'System Table (class) name'; then once we specify our lookup/validation table pattern, we can figure out the rest: -- linking tables (for many-to-many) -- subtables (many-to-one) -- data tables (the nacells of the enterprise) for the validation/lookup tables, they're all named *_TYPE -- and then there's the other two: "state" and "nation"; we have some special-case hard-coding for those two: -- VALIDATION TABLES (having *_TYPE names) CREATE VIEW sys_lookup_tables as SELECT t.class FROM sys_tables t WHERE ( t.class ~ '_type$' ) OR ( -- don't forget these! t.class IN ('state','nation') -- special cases ) ORDER BY t.class ; COMMENT ON COLUMN sys_lookup_tables.class is 'Name of verification/lookup table'; if we eventually add others, this is the ONE place we'd have to mention them. which normal form is THAT? :) this next view shows ALL subsidiary relationships, both subtables and linking (many-to-many) tables: -- SUB TABLES (TABLES having field names matching other table names) CREATE VIEW sys_sub_tables as SELECT s.class as sub, p.class as parent FROM pg_attribute a, sys_tables s, sys_tables p WHERE a.attrelid=s.c_oid -- field from Sub AND a.attname=p.class -- matches table Parent ORDER BY s.class, p.class ; COMMENT ON COLUMN sys_sub_tables.sub is 'Subsidiary table name'; COMMENT ON COLUMN sys_sub_tables.parent is 'Parent table name'; and here, we find the linking tables: CREATE VIEW sys_link_tables as SELECT s.sub as class FROM sys_sub_tables s left join sys_lookup_tables l on (l.class = s.parent) WHERE -- they're subtables, but not lookup tables l.class is null GROUP BY s.sub HAVING count(*) > 1 ; COMMENT ON COLUMN sys_link_tables.class is 'Name of linking (many-to-many) table'; (explain select * from sys_link_tables -- quite amusing!) pulling field descriptors out (comment) is pretty easy, too. (internationalization is a long way off, of course...) what do y'all think? "man the lifeboats"? should we be expecting sea water in the scuppers anytime soon? and why (or why not)... -- There are 10 kinds of people: ones that get binary, and ones that don't. will@serensoft.com http://sourceforge.net/projects/newbiedoc -- we need your brain! http://www.dontUthink.com/ -- your brain needs us! Looking for a firewall? Do you think smoothwall sucks? You're probably right... Try the folks at http://clarkconnect.org/ ! ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
discussion on using this field-naming mechanism: TOPTABLE: UNDERTABLE: id -- pkey id -- pkey data data otherdata DATTABLE -- points to toptable.id instead of the usual undertable.undertable_id and undertable.toptable_id -- On Sat, Feb 01, 2003 at 04:40:39PM +0100, Ayhan Ulusoy wrote: > I played around with a similar paradigm shift for some > time (except for the lookup tables, which I prefer to keep > seperate). > > Its true that it would be ALICE IN WONDERLAND -- only if it > were practical. > > what we hit against is SQL syntax... (just syntax, not > conceptual clash) hmm. show me what you mean-- > Now, SQL lets you spell out your fields when you use them, > such as : person.id, person.created, ... Thats a good > thing. > > BUT, the column names that are OUTPUT from a SELECT dont > have the table name prefixed. Which can be a good or a bad > thing, depending... > > Consider a query like this: > SELECT * FROM person; > > The column names you will get out of this will not have the > table name prefixed. i don't yet see the evil there. YOU specified which table, so you ALREADY KNOW what the table is. > Even the following wont work as you sometimes wished it did: > > SELECT person.id, person.created FROM person; > > It is on the otherhand possible to use AS with each and > every column name : > > SELECT person.id AS person.id , person.created AS > person.created FROM person; > > That should have the prefixes in... What a waste of finger > energy though ... why is it crucial to have the table.* prefix if you're selecting from one table? > When you have JOINS, its even worse... so your objection is primarily in the joins, then: select project.id, org.id -- error: can't result in two "id" fields! from project, org where project.org = org.id ; hmm, yes. badness that way. i can see doing select -- other fields, project.id AS PROJECT_ID, org.id AS ORG_ID -- "wasted^H^H^H^H^H^Hinvested finger effort" from project, org where project.org = org.id ; to get a list of ALL the projects and all their related enterprises; then we just split/_/ to get table.field back. but more commonly we will want projects for a certain institution, so we'll already know the org.id-- select -- other fields, project.id from project, org where project.org = org.id AND ORG.ID = $THIS_VALUE ; here since we know which org we're after, we might only be interested in the project id's related to it. is there other nefariousity lurking in there somewhere? i can also see creating views for each "these_from_those" relation which would keep that part of the logic out of the middle-layer of the application: create view project_from_org as select p.* from project p, org o where o.id = p.org ; select * from project_from_org where org=$some_org_id; and if we somehow forget which org this was, we can look in the resulting "org" field pulled from the project table. === the drawback i see in using ORG PROJECT org_id project_id org_data org_id is that you're just about going to have to hard-code every crosslink (we *know* project.org_id links to org.org_id) or do a bunch of split/_/ anyhow (when fieldname ends "_id" find table mentioned in before underscore then refer to its field of the same name). no? in some ways it's six-of-one-and-a-half-dozeon-of-the-other. but in other ways the "tablename.linkedtable_id" approach seems a hair more tedious than "tablename.linkedtable [as id]" one. -- There are 10 kinds of people: ones that get binary, and ones that don't. will@serensoft.com http://sourceforge.net/projects/newbiedoc -- we need your brain! http://www.dontUthink.com/ -- your brain needs us! Looking for a firewall? Do you think smoothwall sucks? You're probably right... Try the folks at http://clarkconnect.org/ !
On Fri, 31 Jan 2003 22:30:45 -0600, will trillich wrote: > i'd like to run our concept up the flagpole and see what kind of shots > go plumb through it-- > > you know how the standard recommendation is to have every field across > the database with its own unique name? right? > > like school_student_certificate.school_student_certificate_year, > long_enough_to_choke_a_horse.long_enough_to_choke_a_horse_thing... > > well, we're not doing that. not even close. all the literature says it's > a good idea, but they don't say why. and we want to find out. :) > > here goes-- > > EVERY table of ours has > *.id serial primary key, > *.created date default current_date, *.modified timestamp(0), *.by > references person(id), > > and when one table links to another (aside from the "edited by field > above, that is) the name of the foreign key matches that of the table > whose id we're after: > > team.person => person.id > team.org => org.id > dates.date_types => date_types.id > loc.state => state.id > loc.nation => nation.id > project.org => org.id > ...etc... > > It's pretty much up to you and your organization. What's most maintainable for you? Theory only goes so far--that's why we work so hard to normalize relations only to struggle to find ways of de-normalizing to views. Programatically, logically, using your naming scheme is just that: a naming scheme. I personally like it, and I really *don't* like typing 60-character names (or even 30, with an alias, or whatever). What you name your tables and columns should represent the data accurately. That way, when you hire a consultant, you don't have to pay him for the time he (or she) takes figuring out an arcane and non-representative naming scheme. And your successors will bless your name for intelligently naming you fields. :) As far as the data is concerned, Postgresql doesn't care what you call it. I think as long as you avoid keywords you'll be alright. Just my opinion, I could be wrong. ;) -- Matthew Vanecek perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);' ******************************************************************************** For 93 million miles, there is nothing between the sun and my shadow except me. I'm always getting in the way of something...
On Fri, Jan 31, 2003 at 11:33:35PM -0600, pgsql-general-owner+M37147=will=serensoft.com@postgresql.org wrote: > On Fri, 31 Jan 2003 22:30:45 -0600, will trillich wrote: > What you name your tables and columns should represent the data > accurately. That way, when you hire a consultant, you don't have to pay > him for the time he (or she) takes figuring out an arcane and > non-representative naming scheme. And your successors will bless your > name for intelligently naming you fields. :) > > As far as the data is concerned, Postgresql doesn't care what you call it. > I think as long as you avoid keywords you'll be alright. true. you know how many time i wish i could use a "date" field? or "on"? or "table"? :( (without having to use the blasted quotes, of course...) -- There are 10 kinds of people: ones that get binary, and ones that don't. will@serensoft.com http://sourceforge.net/projects/newbiedoc -- we need your brain! http://www.dontUthink.com/ -- your brain needs us! Looking for a firewall? Do you think smoothwall sucks? You're probably right... Try the folks at http://clarkconnect.org/ !