Thread: Normalization and regexp
Hi, Since Canada Post hasn't programmed their automated web tools for calculating transportation costs to work with anythingoutside of Windows Internet Explorer, I'm obliged to write a web based program optimized for all those *other* browsers,myself. Part of this task requires that I set up tables in postgresql that match postal codes to transportationcost. Canada Post provides a booklet for calculating transportation cost based on package weight and the first three charactersof the postal code sequence. For instance, if I want to send a package to an address that includes G8F 1X1 as thepostal code, I take the first 3 characters G8F and look them up in table 1. Table 1 PostalCode Tarrif number --------------------------- G4V 14 G8E-G8G 14 G4R-G4S 13 Since G8F falls in the G8E-G8G range, I now know that the tarrif number is 14. Taking the number 14, I go to table 2 Table 2 For tarrif Code 14 Weight(kg) Price ---------------------- 1.0 5.37 1.5 5.61 2.0 5.82 If the weight of my package is 1kg, the price is 5.37 to send the package to the address bearing G8F as the first 3 charactersof the postal code. To render this in the database, I have done the following: _____________________________ p_code | ============================= pcode_id | tarrif | ----------------------------- G4V | 14 | ----------------------------- G8E | 14 | ----------------------------- G8F | 14 | ----------------------------- G8G | 14 | ----------------------------- G4R | 13 | ----------------------------- G4S | 13 | ----------------------------- __________________________________ price_weight | ================================== tarrif | weight(kg)| price | ---------------------------------- 14 | 1.0 | 5.37 | ---------------------------------- 14 | 1.5 | 5.61 | ---------------------------------- 14 | 2.0 | 5.82 | ---------------------------------- 13 | 1.0 | 5.20 | ---------------------------------- 13 | 1.5 | 5.32 | ---------------------------------- 13 | 2.0 | 5.42 | Therefore my sql statement would look something like this: SELECT price FROM price_weight WHERE p_code.tarrif = price_weight.tarrif AND pcode = 'G8F' AND weight = '1.0'; I think this will work, but before I proceed, I'd like to ask 2 questions: 1. Is it possible to further normalize the data in the p_code and price_weight tables above? 2. Is it possible to abbreviate the number of records in the p_code table using regular expressions. For instance, to avoidrepetition, I thought I'd use regular expressions, so that instead of entering the postal code into separate rows assuch: G4V 14 G8E 14 G8F 14 G8G 14 I could do something like this: (G4V | G8[E-G]) 14 Somehow I don't think this is possible, but I'm looking for any way to minimize the number of postal codes that I have toenter, since there's a lot of them. Anyway, I realize these questions may have more to do with database design than postgresql per se. If there's a better placeto ask them, please point me in the right direction. Thanks, Mark
In order to reduce the number of rows in the p_code table, you could modify it so that it had the following columns: create table p_code ( pCode_Start varchar(3), pCode_End varchar(3), tariff integer, primary key(pCode_Start, pCode_End) ); If you need to insert a single code, use the following: insert into pCode('G4V','G4V',14); and a range as follows: insert into pCode('G8E', 'G8G',14); Your select statement is now: SELECT price FROM price_weight pw, p_code pc WHERE pc.tarrif = pw.tarrif AND pCode_Start <= 'G8F' AND pCode_End >= 'G8F' AND weight = '1.0'; I think that should work OK... John Sidney-Woollett MT said: > Hi, > > Since Canada Post hasn't programmed their automated web tools for > calculating transportation costs to work with anything outside of Windows > Internet Explorer, I'm obliged to write a web based program optimized for > all those *other* browsers, myself. Part of this task requires that I set > up tables in postgresql that match postal codes to transportation cost. > > Canada Post provides a booklet for calculating transportation cost based > on package weight and the first three characters of the postal code > sequence. For instance, if I want to send a package to an address that > includes G8F 1X1 as the postal code, I take the first 3 characters G8F and > look them up in table 1. > > Table 1 > > PostalCode Tarrif number > --------------------------- > G4V 14 > G8E-G8G 14 > G4R-G4S 13 > > Since G8F falls in the G8E-G8G range, I now know that the tarrif number is > 14. Taking the number 14, I go to table 2 > > Table 2 > > For tarrif Code 14 > Weight(kg) Price > ---------------------- > 1.0 5.37 > 1.5 5.61 > 2.0 5.82 > > If the weight of my package is 1kg, the price is 5.37 to send the package > to the address bearing G8F as the first 3 characters of the postal code. > > To render this in the database, I have done the following: > > _____________________________ > p_code | > ============================= > pcode_id | tarrif | > ----------------------------- > G4V | 14 | > ----------------------------- > G8E | 14 | > ----------------------------- > G8F | 14 | > ----------------------------- > G8G | 14 | > ----------------------------- > G4R | 13 | > ----------------------------- > G4S | 13 | > ----------------------------- > > __________________________________ > price_weight | > ================================== > tarrif | weight(kg)| price | > ---------------------------------- > 14 | 1.0 | 5.37 | > ---------------------------------- > 14 | 1.5 | 5.61 | > ---------------------------------- > 14 | 2.0 | 5.82 | > ---------------------------------- > 13 | 1.0 | 5.20 | > ---------------------------------- > 13 | 1.5 | 5.32 | > ---------------------------------- > 13 | 2.0 | 5.42 | > > > Therefore my sql statement would look something like this: > > SELECT price FROM price_weight > WHERE p_code.tarrif = price_weight.tarrif > AND pcode = 'G8F' > AND weight = '1.0'; > > I think this will work, but before I proceed, I'd like to ask 2 questions: > > 1. > Is it possible to further normalize the data in the p_code and > price_weight tables above? > > 2. > Is it possible to abbreviate the number of records in the p_code table > using regular expressions. For instance, to avoid repetition, I thought > I'd use regular expressions, so that instead of entering the postal code > into separate rows as such: > > G4V 14 > G8E 14 > G8F 14 > G8G 14 > > I could do something like this: > > (G4V | G8[E-G]) 14 > > Somehow I don't think this is possible, but I'm looking for any way to > minimize the number of postal codes that I have to enter, since there's a > lot of them. > > Anyway, I realize these questions may have more to do with database design > than postgresql per se. If there's a better place to ask them, please > point me in the right direction. > > Thanks, > > Mark > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >
There's no reason you couldn't store a regex string into a column. The trick has only to do with proper escaping for the various steps, namely the loading method (regular SQL has some escaping requirements and so you need double back-slashes and escaped quotes). A work-around for this is to use other input methods like COPY. Then there's the escaping required for the regex tool of choice: either client-side or server(db)-side. If you're doing the regex processing client-side this might be easier. Depends on usage. In short, the principal of what you propose seems completely sound. However, there might be tricks to handling character escaping correctly. == Ezra Epstein "MT" <m_tessier@sympatico.ca> wrote in message news:20031222181239.476cfa47.m_tessier@sympatico.ca... > Hi, > > Since Canada Post hasn't programmed their automated web tools for calculating transportation costs to work with anything outside of Windows Internet Explorer, I'm obliged to write a web based program optimized for all those *other* browsers, myself. Part of this task requires that I set up tables in postgresql that match postal codes to transportation cost. > > Canada Post provides a booklet for calculating transportation cost based on package weight and the first three characters of the postal code sequence. For instance, if I want to send a package to an address that includes G8F 1X1 as the postal code, I take the first 3 characters G8F and look them up in table 1. > > Table 1 > > PostalCode Tarrif number > --------------------------- > G4V 14 > G8E-G8G 14 > G4R-G4S 13 > > Since G8F falls in the G8E-G8G range, I now know that the tarrif number is 14. Taking the number 14, I go to table 2 > > Table 2 > > For tarrif Code 14 > Weight(kg) Price > ---------------------- > 1.0 5.37 > 1.5 5.61 > 2.0 5.82 > > If the weight of my package is 1kg, the price is 5.37 to send the package to the address bearing G8F as the first 3 characters of the postal code. > > To render this in the database, I have done the following: > > _____________________________ > p_code | > ============================= > pcode_id | tarrif | > ----------------------------- > G4V | 14 | > ----------------------------- > G8E | 14 | > ----------------------------- > G8F | 14 | > ----------------------------- > G8G | 14 | > ----------------------------- > G4R | 13 | > ----------------------------- > G4S | 13 | > ----------------------------- > > __________________________________ > price_weight | > ================================== > tarrif | weight(kg)| price | > ---------------------------------- > 14 | 1.0 | 5.37 | > ---------------------------------- > 14 | 1.5 | 5.61 | > ---------------------------------- > 14 | 2.0 | 5.82 | > ---------------------------------- > 13 | 1.0 | 5.20 | > ---------------------------------- > 13 | 1.5 | 5.32 | > ---------------------------------- > 13 | 2.0 | 5.42 | > > > Therefore my sql statement would look something like this: > > SELECT price FROM price_weight > WHERE p_code.tarrif = price_weight.tarrif > AND pcode = 'G8F' > AND weight = '1.0'; > > I think this will work, but before I proceed, I'd like to ask 2 questions: > > 1. > Is it possible to further normalize the data in the p_code and price_weight tables above? > > 2. > Is it possible to abbreviate the number of records in the p_code table using regular expressions. For instance, to avoid repetition, I thought I'd use regular expressions, so that instead of entering the postal code into separate rows as such: > > G4V 14 > G8E 14 > G8F 14 > G8G 14 > > I could do something like this: > > (G4V | G8[E-G]) 14 > > Somehow I don't think this is possible, but I'm looking for any way to minimize the number of postal codes that I have to enter, since there's a lot of them. > > Anyway, I realize these questions may have more to do with database design than postgresql per se. If there's a better place to ask them, please point me in the right direction. > > Thanks, > > Mark > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >