Thread: Tidying values on variable instantiation

Tidying values on variable instantiation

From
"Bath, David"
Date:
Folks,

Preamble:
* I can create a check constraint on a column or domain that enforces "no leading or trailing whitespace".  Imagine
thatthe domain is called "trimmed_varchar"
 
* I can create plpgsql function/triggers that "tidy" up incoming varchars, trimming the offending whitespaces, on a
columnby column basis.
 
* When creating a column based on a domain with the check constraint, I cannot "tidy it up" during a
pre-insert/pre-updatetrigger.  Fair enough.
 
* I'm only asking about this because I am a long-in-the-tooth Oracle guy, and Pg seems to have many *very* nice
features,and there might be an *elegant* way to achieve this that I cannot attempt in Oracle.
 

Desired Outcome(s):
* I would like to have the convenience of declaring a column that obeys a constraint (similar to using a domain), but
allowsa "tidy-up" as the value is created BEFORE asserting the constraint.  This *might* be termed a "domain trigger".
(Perhapseven a WORM is possible!).
 
* I would like to able to declare columns as  "trimmed_varchar(n)".
* I'd like to be able to use the same approach for other "weak domains".

Question(s):
* Am I being realistic, or should I grit my teeth and clone code from trigger to trigger and column to column?
* Is this something I should try and do using domains, types and cast functions from "text" or some horrible
combinationof them all?
 
* Has anybody got a code sample that might do something similar.

Apologies if I have missed something obvious in the manual, or if it is
a well-known item in the wish-lists, but I am very new to serious Pg
work, and have a tight schedule to do deliver a schema. *sigh*

Thanks in advance
-- 
David T. Bath
dave.bath@unix.net



Re: Tidying values on variable instantiation

From
Michael Glaesemann
Date:
On Aug 26, 2005, at 12:04 PM, Bath, David wrote:

> Desired Outcome(s):
> * I would like to have the convenience of declaring a column that
> obeys
>   a constraint (similar to using a domain), but allows a "tidy-up"
> as the
>   value is created BEFORE asserting the constraint.  This *might* be
>   termed a "domain trigger".  (Perhaps even a WORM is possible!).
> * I would like to able to declare columns as
>   "trimmed_varchar(n)".
> * I'd like to be able to use the same approach for other "weak
> domains".


Unfortuantely, I don't know of a way to obtain your desired outcomes,
but perhaps can offer a couple of ideas that you haven't mentioned
(though you may have already thought of them and discarded them as
undesireable. In that case, my apologies :)

Perhaps rather doing this with a trigger and having the table take
care of it, you could use pl functions to handle the inserts, so
instead of using INSERT directly, you could call the
insert_into_table_foo function. The insert_into_table_foo function
would clean up the input and then call INSERT. A disadvantage of this
is that you'll need to write one of these for each table, though
there are some who handle a lot of their inserts, updates, etc, via
pl functions rather than calling the INSERT and UPDATE commands
directly.

Another option would be to have a separate cleaning function (e.g.,
clean_foo() )for each "type" you want, and then call it with
something like INSERT INTO bar (baz, bat, foo) values (232,
'2005-02-20', clean_foo('protofoo')); This has the advantage that you
just need to write one function for each type (rather than each
table), but you'll have to remember to call it.

While I can understand your motivation, I personally think this kind
of operation is best left in the application layer (which includes
such insert functions) rather than the DDL.

Just my ¥2.

Michael Glaesemann
grzm myrealbox com




Re: Tidying values on variable instantiation

From
"Russell Simpkins"
Date:
> Desired Outcome(s):
> * I would like to have the convenience of declaring a column that obeys
>  a constraint (similar to using a domain), but allows a "tidy-up" as the
>  value is created BEFORE asserting the constraint.  This *might* be
>  termed a "domain trigger".  (Perhaps even a WORM is possible!).
> * I would like to able to declare columns as
>  "trimmed_varchar(n)".
> * I'd like to be able to use the same approach for other "weak domains".

I'm not sure these any easier way to do this than with tirggers.

If the daunting task would writing a large number of triggers, I would write 
sql or php to generate all the triggers. Remember that you can introspect 
the db using the system catalogs. I've had a lot of good results generating 
triggers and sql using Middlegen and Velocity.


Re: Tidying values on variable instantiation

From
"Greg Patnude"
Date:
IMHO: It's not necessarily the job of the RDBMS to be responsible for
formatting and cleaning of your data... This is a job better suited for the
application layer and the data model... 

The RDBMS should only be responsible for enforcing constraints on the
data... not validating or purifying the data... 

Data validation and purification should be performed at the application
layer -- you should format your data appropriately BEFORE trying any
INSERT/UPDATE operations.



-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]
On Behalf Of Bath, David
Sent: Thursday, August 25, 2005 8:04 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] Tidying values on variable instantiation

Folks,

Preamble:
* I can create a check constraint on a column or domain that enforces "no leading or trailing whitespace".  Imagine
thatthe domain is called "trimmed_varchar"
 
* I can create plpgsql function/triggers that "tidy" up incoming varchars, trimming the offending whitespaces, on a
columnby column basis.
 
* When creating a column based on a domain with the check constraint, I cannot "tidy it up" during a
pre-insert/pre-updatetrigger.  Fair enough.
 
* I'm only asking about this because I am a long-in-the-tooth Oracle guy, and Pg seems to have many *very* nice
features,and there might be an *elegant* way to achieve this that I cannot attempt in Oracle.
 

Desired Outcome(s):
* I would like to have the convenience of declaring a column that obeys a constraint (similar to using a domain), but
allowsa "tidy-up" as the value is created BEFORE asserting the constraint.  This *might* be termed a "domain trigger".
(Perhapseven a WORM is possible!).
 
* I would like to able to declare columns as  "trimmed_varchar(n)".
* I'd like to be able to use the same approach for other "weak domains".

Question(s):
* Am I being realistic, or should I grit my teeth and clone code from trigger to trigger and column to column?
* Is this something I should try and do using domains, types and cast functions from "text" or some horrible
combinationof them all?
 
* Has anybody got a code sample that might do something similar.

Apologies if I have missed something obvious in the manual, or if it is
a well-known item in the wish-lists, but I am very new to serious Pg
work, and have a tight schedule to do deliver a schema. *sigh*

Thanks in advance
-- 
David T. Bath
dave.bath@unix.net


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to      choose an index scan if your joining column's
datatypesdo not      match
 


Re: Tidying values on variable instantiation

From
Bruno Wolff III
Date:
On Fri, Aug 26, 2005 at 13:04:10 +1000,
> Desired Outcome(s):
> * I would like to have the convenience of declaring a column that obeys
>   a constraint (similar to using a domain), but allows a "tidy-up" as the
>   value is created BEFORE asserting the constraint.  This *might* be
>   termed a "domain trigger".  (Perhaps even a WORM is possible!).
> * I would like to able to declare columns as 
>   "trimmed_varchar(n)".
> * I'd like to be able to use the same approach for other "weak domains".
> 
> Question(s):
> * Am I being realistic, or should I grit my teeth and clone code from
>   trigger to trigger and column to column?
> * Is this something I should try and do using domains, types and
>   cast functions from "text" or some horrible combination of them all?
> * Has anybody got a code sample that might do something similar.

I think it is normal to expect the application to pass you clean data.

I think you can do what you want by creating a new type. I seem to remember
there are issues with creating whatever(n) types (my memory is that varchar(n)
is hardwired into the parser), but certainly you could make an alternate
version of text whose input function trimmed leading and trailing whitespace.
You could also create casts between this new type and text if you needed
that ability as well.


Re: Tidying values on variable instantiation

From
Chris Browne
Date:
gpatnude@hotmail.com ("Greg Patnude") writes:
> Data validation and purification should be performed at the
> application layer -- you should format your data appropriately
> BEFORE trying any INSERT/UPDATE operations.

It seems to me that one might create some stored functions that can do
some validation/purification which, by virtue of residing in the
database, have the ability to efficiently access other data in order
to do data-based validation.

I know I have found that to be a useful approach...
-- 
"cbbrowne","@","ntlug.org"
http://cbbrowne.com/info/linuxdistributions.html
"The day Microsoft  makes something that doesn't suck  is probably the
day they start making vacuum cleaners" - Ernst Jan Plugge


Re: Tidying values on variable instantiation

From
Chris Browne
Date:
dave.bath@unix.net ("Bath, David") writes:
> Question(s):
> * Am I being realistic, or should I grit my teeth and clone code from
>   trigger to trigger and column to column?
> * Is this something I should try and do using domains, types and
>   cast functions from "text" or some horrible combination of them all?
> * Has anybody got a code sample that might do something similar.

If what you are looking for is an API that "cleans things up," then
I'd suggest the thought of building a stored function API, and using
*that* to do the work instead of hitting tables directly.

In the 'domain registry' area, I have done this very sort of thing,
mostly oriented towards data conversions.

We have, as major objects, domains, contacts, and nameservers.

I have some functions, create_domain(), create_contact(), and
create_nameserver(), where the stored procs receive a very limited set
of parameters (in comparison, at least, with the total number of
attributes associated with the respective sets of tables).

Relevant to the thread, create_contact() does a lot of "data
cleansing" in order to try to perform the Procrustean task of forcing
telephone numbers into a particular "standard form."

That approach has proved very useful.

If you have some well-defined set of actions that you want to perform
on the objects in your system, defining a stored function for each
action gives a good way of centralizing the "cleanup" parts.

We have separate status tables; I could have defined triggers to try
to manage them; it seemed more sensible to instead handle that in the
stored procs.
-- 
"cbbrowne","@","acm.org"
http://www3.sympatico.ca/cbbrowne/oses.html
Rules of the Evil Overlord #69.  "All midwives will be banned from the
realm.    All   babies   will    be   delivered    at   state-approved
hospitals. Orphans  will be placed  in foster-homes, not  abandoned in
the   woods    to   be   raised    by   creatures   of    the   wild."
<http://www.eviloverlord.com/>