T E Schmitz wrote:
> Gary Stainburn wrote:
>> On Friday 07 July 2006 14:51, T E Schmitz wrote:
>>
>>> I would like to split the contents of a column using substring with a
>>> regular expression:
>>>
>>> The column contains something like
>>> "150mm LD AD Asp XR Macro"
>>> I want to split this into
>>> "150mm", "LD AD Asp XR Macro"
>>>
>>
>>
>> select substring('150mm LD AD Asp XR Macro','^\\d+mm') as BASE_NAME,
>> substring('150mm LD AD Asp XR Macro','^\\d+mm (.*)$') as SUFFIX;
>> base_name | suffix
>> -----------+--------------------
>> 150mm | LD AD Asp XR Macro
>> (1 row)
>>
>> The brackets surround the required match
>
>
> This is ingenious! I had been looking at chapter 9.6 Pattern Matching.
> Am I missing something? I did not realize that the brackets indicate
> the required match.
>
> But that takes me to the next problem:
>
> For the sake of the example I simplified the regular pattern.
> In reality, BASE_NAME might be:
>
> 28mm
> 28-70mm
>
> So the reg. expr. requires brackets:
>
> substring (NAME, '^(\\d+(-\\d+)?mm)' ) as BASE_NAME
>
> Actually, the pattern is more complex than that and I cannot see how I
> can express it without brackets.
>
>
Will the mm always be the end of the base name?
substring(NAME, '^(.*?mm)') as BASE_NAME
That should match all the way up to the first mm. If there are actually
a set number of different units that it might end in, such as mm and cm,
you can do:
substring(NAME, '^(.*?(mm|cm))') as BASE_NAME
That will match everything up to the first mm or cm. Note that you
don't have to worry about the second set of brackets returning anything
as the regexp version of substring only returns what is matched by the
first parenthesised subexpression.
--
erik jones <erik@myemma.com>
software development
emma(r)