On 28 May 2010, at 20:39, Tom Wilcox wrote:
> out = ''
> for tok in toks:
>     ## full word replace
>     if tok == 'house' : out += 'hse'+ADDR_FIELD_DELIM
>     elif tok == 'ground' : out += 'grd'+ADDR_FIELD_DELIM
>     elif tok == 'gnd' : out += 'grd'+ADDR_FIELD_DELIM
>     elif tok == 'front' : out += 'fnt'+ADDR_FIELD_DELIM
>     elif tok == 'floor' : out += 'flr'+ADDR_FIELD_DELIM
>     elif tok == 'floors' : out += 'flr'+ADDR_FIELD_DELIM
Not that it would solve your problems, but you can write the above much more elegantly using a dictionary:
# normalize the token
try:
    out += {
        'house' : 'hse',
        'ground' : 'grd',
        'gnd' : 'grd',
        'front' : 'fnt',
        'floor' : 'flr',
        ...
    }[tok]
except KeyError:
    out += tok
# add a field delimiter unless the token is one of the exceptions
if tok not in ('borough', 'city', 'of', 'the', 'at', 'incl', 'inc'):
    out += ADDR_FIELD_DELIM
You should probably define the dictionary and the tuple outside the for-loop though; I'm not sure the Python interpreter is smart enough
to build them only once otherwise. The concept remains the same though.
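
A minimal sketch of that arrangement, with the lookup tables built once at module level. The names ABBREVIATIONS, NO_DELIM_AFTER and normalize_tokens are made up for illustration, ADDR_FIELD_DELIM is set to a space purely as a placeholder for whatever the real script defines, and the table holds only the pairs quoted in this thread. It also swaps the try/except for dict.get and collects the pieces in a list instead of using repeated string concatenation:

```python
# Placeholder value (assumption); the real ADDR_FIELD_DELIM comes from the original script.
ADDR_FIELD_DELIM = ' '

# Built once, outside the loop. Only the pairs quoted in the thread are listed here.
ABBREVIATIONS = {
    'house': 'hse',
    'ground': 'grd',
    'gnd': 'grd',
    'front': 'fnt',
    'floor': 'flr',
    'floors': 'flr',
}

# Tokens that are not followed by a field delimiter.
NO_DELIM_AFTER = frozenset(('borough', 'city', 'of', 'the', 'at', 'incl', 'inc'))


def normalize_tokens(toks):
    """Abbreviate known tokens and append a delimiter after each, except the listed exceptions."""
    parts = []
    for tok in toks:
        # dict.get falls back to the token itself when no abbreviation is known
        parts.append(ABBREVIATIONS.get(tok, tok))
        if tok not in NO_DELIM_AFTER:
            parts.append(ADDR_FIELD_DELIM)
    return ''.join(parts)
```

Calling normalize_tokens(toks) then takes the place of the loop that builds out.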
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.