Home > mailing lists

Re: Normalization and regexp - Mailing list pgsql-general

From	ezra epstein
Subject	Re: Normalization and regexp
Date	December 24, 2003 20:53:04
Msg-id	eYKdnUVGRbArtXeiXTWc-w@speakeasy.net Whole thread Raw
In response to	Normalization and regexp (MT <m_tessier@sympatico.ca>)
List	pgsql-general

Tree view

There's no reason you couldn't store a regex string into a column.  The
trick has only to do with proper escaping for the various steps, namely the
loading method (regular SQL has some escaping requirements and so you need
double back-slashes and escaped quotes).  A work-around for this is to use
other input methods like COPY.  Then there's the escaping required for the
regex tool of choice: either client-side or server(db)-side.

If you're doing the regex processing client-side this might be easier.
Depends on usage.

In short, the principal of what you propose seems completely sound.
However, there might be tricks to handling character escaping correctly.

== Ezra Epstein

"MT" <m_tessier@sympatico.ca> wrote in message
news:20031222181239.476cfa47.m_tessier@sympatico.ca...
> Hi,
>
> Since Canada Post hasn't programmed their automated web tools for
calculating transportation costs to work with anything outside of Windows
Internet Explorer, I'm obliged to write a web based program optimized for
all those *other* browsers, myself. Part of this task requires that I set up
tables in postgresql that match postal codes to transportation cost.
>
> Canada Post provides a booklet for calculating transportation cost based
on package weight and the first three characters of the postal code
sequence. For instance, if I want to send a package to an address that
includes G8F 1X1 as the postal code, I take the first 3 characters G8F and
look them up in table 1.
>
> Table 1
>
> PostalCode    Tarrif number
> ---------------------------
> G4V               14
> G8E-G8G           14
> G4R-G4S           13
>
> Since G8F falls in the G8E-G8G range, I now know that the tarrif number is
14. Taking the number 14, I go to table 2
>
> Table 2
>
>    For tarrif Code 14
> Weight(kg)       Price
> ----------------------
>   1.0            5.37
>   1.5            5.61
>   2.0            5.82
>
> If the weight of my package is 1kg, the price is 5.37 to send the package
to the address bearing G8F as the first 3 characters of the postal code.
>
> To render this in the database, I have done the following:
>
> _____________________________
>            p_code            |
> =============================
>   pcode_id   |     tarrif    |
> -----------------------------
>     G4V      |       14      |
> -----------------------------
>     G8E      |       14      |
> -----------------------------
>     G8F      |       14      |
> -----------------------------
>     G8G      |       14      |
> -----------------------------
>     G4R      |       13      |
> -----------------------------
>     G4S      |       13      |
> -----------------------------
>
> __________________________________
>            price_weight           |
> ==================================
>    tarrif   | weight(kg)|  price  |
> ----------------------------------
>      14     |    1.0    |   5.37  |
> ----------------------------------
>      14     |    1.5    |   5.61  |
> ----------------------------------
>      14     |    2.0    |   5.82  |
> ----------------------------------
>      13     |    1.0    |   5.20  |
> ----------------------------------
>      13     |    1.5    |   5.32  |
> ----------------------------------
>      13     |    2.0    |   5.42  |
>
>
> Therefore my sql statement would look something like this:
>
> SELECT price FROM price_weight
> WHERE p_code.tarrif = price_weight.tarrif
> AND pcode = 'G8F'
> AND weight = '1.0';
>
> I think this will work, but before I proceed, I'd like to ask 2 questions:
>
> 1.
> Is it possible to further normalize the data in the p_code and
price_weight tables above?
>
> 2.
> Is it possible to abbreviate the number of records in the p_code table
using regular expressions. For instance, to avoid repetition, I thought I'd
use regular expressions, so that instead of entering the postal code into
separate rows as such:
>
> G4V 14
> G8E 14
> G8F 14
> G8G 14
>
> I could do something like this:
>
> (G4V | G8[E-G]) 14
>
> Somehow I don't think this is possible, but I'm looking for any way to
minimize the number of postal codes that I have to enter, since there's a
lot of them.
>
> Anyway, I realize these questions may have more to do with database design
than postgresql per se. If there's a better place to ask them, please point
me in the right direction.
>
> Thanks,
>
> Mark
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

pgsql-general by date:

From: "ezra epstein"
Date: 24 December 2003, 20:53:02
Subject: Re: SELECT INTO broken (?) in PL/pgSQL when using a set returning function in FROM clause (BGUG?)

From: jrpogo@yahoo.com (Jeffrey Pogodzinski)
Date: 24 December 2003, 20:53:08
Subject: Postgresql 7.4.1 Compile problem for SCO 5.0.5

Re: Normalization and regexp - Mailing list pgsql-general

Previous

Next