Thread: hiding normalization with views
I have dealt with postgresql for the last couple of years with a very simple database, but now I'm trying to do something a bit more complicated and am having some trouble. Here's a situation: I am wanting to manage a (postal) mailing list. Nearly everyone lives in the same few cities, so to store things efficiently (and minimize typos, etc) I have a few tables: Table "customers" Column | Type | Modifiers ----------------+---------+----------- customer_id | integer | creation_date | date | name | text | not null bill_address | text | bill_city | integer | bill_state | integer | bill_zip | integer | alt_address | text | alt_city | integer | alt_state | integer | alt_zip | integer | phone | text | alt_phone | text | contact_name | text | Unique keys: customers_customer_id_key Table "cities" Column | Type | Modifiers --------+---------+----------- id | integer | city | text | and also "states" and "zips" that look just like "cities". The problem is, whenever someone looks at or updates the data, I want "customers.billing_city" and "cities.id" to be hidden and automatically created when needed. I can create a view that shows only the data I want, but that doesn't help for INSERTs and UPDATEs. Can anyone point me in the right direction or give an example?
Josh, I am assuming that data gets entered into the customers table and then should automagically update the cities, states, and zip_codes tables. For example, if a new record or update contains the city "Chicago" and Chicago doesn't exist in cities, it should be added/inserted. There might be a way to do this by setting up integrity constraints ( i don't know how), but you could also do it in code. When a record is inserted/updated in the customers table, do a select with the city value in the cities table to see if the city exists. if the city doesn't exist, insert it. Do the same with states and zip codes. This is some work and like I said, you might able to do it integrity constraints, but we need some help from the postgres gurus to see if that's possible. regards, Devinder Rajput Stores Division Corporate Offices Chicago, IL (773) 442-6474 "Joshua Daniel Franklin" To: pgsql-novice@postgresql.org <joshua@iocc.com> cc: Sent by: Subject: [NOVICE] hiding normalization with views pgsql-novice-owner@post gresql.org 10/08/2002 12:26 PM I have dealt with postgresql for the last couple of years with a very simple database, but now I'm trying to do something a bit more complicated and am having some trouble. Here's a situation: I am wanting to manage a (postal) mailing list. Nearly everyone lives in the same few cities, so to store things efficiently (and minimize typos, etc) I have a few tables: Table "customers" Column | Type | Modifiers ----------------+---------+----------- customer_id | integer | creation_date | date | name | text | not null bill_address | text | bill_city | integer | bill_state | integer | bill_zip | integer | alt_address | text | alt_city | integer | alt_state | integer | alt_zip | integer | phone | text | alt_phone | text | contact_name | text | Unique keys: customers_customer_id_key Table "cities" Column | Type | Modifiers --------+---------+----------- id | integer | city | text | and also "states" and "zips" that look just like "cities". The problem is, whenever someone looks at or updates the data, I want "customers.billing_city" and "cities.id" to be hidden and automatically created when needed. I can create a view that shows only the data I want, but that doesn't help for INSERTs and UPDATEs. Can anyone point me in the right direction or give an example? ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Thanks, doing a SELECT to see if the city (etc.) was there is exactly what I am trying to avoid. Gurus, any comments? On Tue, 8 Oct 2002, Devinder K Rajput wrote: > > Josh, > > I am assuming that data gets entered into the customers table and then > should automagically update the cities, states, and zip_codes tables. For > example, if a new record or update contains the city "Chicago" and Chicago > doesn't exist in cities, it should be added/inserted. There might be a way > to do this by setting up integrity constraints ( i don't know how), but you > could also do it in code. When a record is inserted/updated in the > customers table, do a select with the city value in the cities table to see > if the city exists. if the city doesn't exist, insert it. Do the same > with states and zip codes. This is some work and like I said, you might > able to do it integrity constraints, but we need some help from the > postgres gurus to see if that's possible. > > regards, > > Devinder Rajput
Devinder, Josh, > I am assuming that data gets entered into the customers table and then > should automagically update the cities, states, and zip_codes tables. For > example, if a new record or update contains the city "Chicago" and Chicago > doesn't exist in cities, it should be added/inserted. There might be a way > to do this by setting up integrity constraints ( i don't know how), but you > could also do it in code. When a record is inserted/updated in the > customers table, do a select with the city value in the cities table to see > if the city exists. if the city doesn't exist, insert it. Do the same > with states and zip codes. This is some work and like I said, you might > able to do it integrity constraints, but we need some help from the > postgres gurus to see if that's possible. Devinder is correct. In fact, this is largely what the PostgreSQL RULES system is for. One can define a "FOR INSERT" RULE on a view, and then program it in a procedural language on what to do with all of the data that belongs in sub-tables. See: http://www.us.postgresql.org/users-lounge/docs/7.2/postgres/rules.html ALternately, you can do what I do in most of my applications, which is to update data through functions or middleware rather than calling direct INSERT/UPDATE statements. This allows me to implement and maintain a large array of complex business rules that would be difficult, or at least not centrally managed, if I used RULES. For example, instead of having the database client call: INSERT INTO users ( first_name, last_name, logon, password ) VALUES ( $fname, $lname, $logon, $pwd ); it calls SELECT fn_add_user ( $fname, $lname, $logon, $pwd ); Which approach you take (RULES, Functions, or 3rd-party middleware) depends largely on your applicaiton specification; none is intrinsically superior. -- Josh Berkus josh@agliodbs.com Aglio Database Solutions San Francisco
On Tue, 2002-10-08 at 12:26, Joshua Daniel Franklin wrote: > I have dealt with postgresql for the last couple of years > with a very simple database, but now I'm trying to do something > a bit more complicated and am having some trouble. Here's a situation: > > I am wanting to manage a (postal) mailing list. Nearly everyone > lives in the same few cities, so to store things efficiently (and > minimize typos, etc) I have a few tables: > > Table "customers" > Column | Type | Modifiers > ----------------+---------+----------- > customer_id | integer | > creation_date | date | > name | text | not null > bill_address | text | > bill_city | integer | > bill_state | integer | > bill_zip | integer | > alt_address | text | > alt_city | integer | > alt_state | integer | > alt_zip | integer | > phone | text | > alt_phone | text | > contact_name | text | > Unique keys: customers_customer_id_key > > Table "cities" > Column | Type | Modifiers > --------+---------+----------- > id | integer | > city | text | > > and also "states" and "zips" that look just like "cities". > The problem is, whenever someone looks at or updates the data, I > want "customers.billing_city" and "cities.id" to be hidden and > automatically created when needed. I can create a view > that shows only the data I want, but that doesn't help for INSERTs > and UPDATEs. > > Can anyone point me in the right direction or give an example? 3 *slightly* off topic comments: Sometimes, you can go too "normalization happy". Yes, it's academically correct, and you save a little disk space, but disks are cheap, and you add bunches of extra joins to each query. Since there are only 50 states, plus territories and Canada, changing [bill|alt]_state to CHAR(2) is a good idea. What are [bill|alt_zip? Are they designed to store the zip code, or point to a zip code table? I think I'd change them to CHAR(10) (so that Plus-4 and Canadian postal codes will fit). -- +------------------------------------------------------------+ | Ron Johnson, Jr. mailto:ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "they love our milk and honey, but preach about another | | way of living" | | Merle Haggard, "The Fighting Side Of Me" | +------------------------------------------------------------+
On 8 Oct 2002, Ron Johnson wrote: > Sometimes, you can go too "normalization happy". Yes, it's academically > correct, and you save a little disk space, but disks are cheap, and > you add bunches of extra joins to each query. Yes, I thought about that, too. The big thing to me is that with the system we currently use (flat files) there are quite a lot of typos. For the city/state this is not a big deal but the ZIP code it is. We did a mailing around 6 months ago with about 50 return-to-senders many of which were stupid typos. I don't think the normalization will solve all my problems, but it will at least be something. Also, just for the discussion, we're also talking about instead of a separate table just having the (java) client reading a configuration file with the common values for autocompletion.
Forgive my ignorance if I'm incorrect but don't you need to do a select on your cities, zips, etc. to populate the id into you customers data anyway? Therefore you would already know if you have the city, zip, .... Also, if you have only a "few" cities, it may be more efficient to store the city definition which includes the state, zip etc. and only have 1 id to normalize you data in the customer table. I work for a transportation company and have over 140,000 cities defined. For speed I'd rather do 1 query to get the city information I need rather than 3 or 4 queries. On Tuesday 08 October 2002 12:48 pm, you wrote: > Thanks, doing a SELECT to see if the city (etc.) was there is > exactly what I am trying to avoid. > > Gurus, any comments? > > On Tue, 8 Oct 2002, Devinder K Rajput wrote: > > Josh, > > > > I am assuming that data gets entered into the customers table and then > > should automagically update the cities, states, and zip_codes tables. > > For example, if a new record or update contains the city "Chicago" and > > Chicago doesn't exist in cities, it should be added/inserted. There > > might be a way to do this by setting up integrity constraints ( i don't > > know how), but you could also do it in code. When a record is > > inserted/updated in the customers table, do a select with the city value > > in the cities table to see if the city exists. if the city doesn't > > exist, insert it. Do the same with states and zip codes. This is some > > work and like I said, you might able to do it integrity constraints, but > > we need some help from the postgres gurus to see if that's possible. > > > > regards, > > > > Devinder Rajput > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
Ron Johnson wrote: > Since there are only 50 states, plus territories and Canada, > changing [bill|alt]_state to CHAR(2) is a good idea. ... only a good idea if you know the customer base is North America only. Outside the USA, CHAR(2) is a dubious choice. Probably doesn't matter in this instance. :-) Giles