Thread: Replacing Ordinal Suffixes
Hi list,
I'm stumped.
I am trying to use Regexp_Replace to replace ordinal suffixes in addresses (eg have '126th' want '126') for comparison purposes. So far no luck.
I have found that
SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?!/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------
300 nor 126 reet
regexp_replace
------------------
300 nor 126 reet
but
SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?=/D)(st|nd|rd|th)', '', 'g');
regexp_replace
------------------------
300 north 126th street
regexp_replace
------------------------
300 north 126th street
I'm a novice with regular expressions and google hasn't helped much.
Any suggestions?
Thanks,
George
Try this: SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(\d)(st|nd|rd|th)', '\1', 'g'); Note that matching a number is \d not /D: backslash, not forward slash, and lowercase d not uppercase. \d means a digit, \D means anything except a digit. Also, I don't think Postgres supports positive lookbehind expressions (which are actually (?<=foo), not (?!foo)), but you can get the same effect by capturing the number with (\d) and then outputting it again with the \1. Paul On Fri, Feb 28, 2014 at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote: > Hi list, > > I'm stumped. > > I am trying to use Regexp_Replace to replace ordinal suffixes in addresses > (eg have '126th' want '126') for comparison purposes. So far no luck. > > I have found that > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), > '(?!/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------ > 300 nor 126 reet > > but > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), > '(?=/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------------ > 300 north 126th street > > I'm a novice with regular expressions and google hasn't helped much. > > Any suggestions? > > Thanks, > George -- _________________________________ Pulchritudo splendor veritatis.
On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote: > Hi list, > > I'm stumped. > > I am trying to use Regexp_Replace to replace ordinal suffixes in addresses (eg have '126th' want '126') for comparisonpurposes. So far no luck. > > I have found that > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?!/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------ > 300 nor 126 reet > > but > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?=/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------------ > 300 north 126th street > > I'm a novice with regular expressions and google hasn't helped much. > > Any suggestions? Maybe this? select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); Cheers, Steve
This is a kick *ss forum. I must say. On Fri, 2014-02-28 at 14:17 -0800, Steve Atkins wrote: > On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote: > > > Hi list, > > > > I'm stumped. > > > > I am trying to use Regexp_Replace to replace ordinal suffixes in addresses (eg have '126th' want '126') for comparisonpurposes. So far no luck. > > > > I have found that > > > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?!/D)(st|nd|rd|th)', '', 'g'); > > regexp_replace > > ------------------ > > 300 nor 126 reet > > > > but > > > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(?=/D)(st|nd|rd|th)', '', 'g'); > > regexp_replace > > ------------------------ > > 300 north 126th street > > > > I'm a novice with regular expressions and google hasn't helped much. > > > > Any suggestions? > > Maybe this? > > select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); > > Cheers, > Steve > > >
From: Paul Jungwirth >Try this: >SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), '(\d)(st|nd|rd|th)', '\1', 'g'); Hi Paul, No luck... SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), E'(\d)(st|nd|rd|th)', E'\1', 'g'); regexp_replace ------------------------ 300 north 126th street (1 row) >Note that matching a number is \d not /D: backslash, not forward >slash, and lowercase d not uppercase. \d means a digit, \D means >anything except a digit. >Also, I don't think Postgres supports positive lookbehind expressions >(which are actually (?<=foo), not (?!foo)), but you can get the same >effect by capturing the number with (\d) and then outputting it again >with the \1. >Paul On Fri, Feb 28, 2014 at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote: > Hi list, > > I'm stumped. > > I am trying to use Regexp_Replace to replace ordinal suffixes in addresses > (eg have '126th' want '126') for comparison purposes. So far no luck. > > I have found that > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), > '(?!/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------ > 300 nor 126 reet > > but > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), > '(?=/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------------ > 300 north 126th street > > I'm a novice with regular expressions and google hasn't helped much. > > Any suggestions? > > Thanks, > George -- _________________________________ Pulchritudo splendor veritatis.
From: Steve Atkins
On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:
> Hi list,
>
> I'm stumped.
>
> I am trying to use Regexp_Replace to replace ordinal suffixes in addresses
> (eg have '126th' want '126') for comparison purposes. So far no luck.
>
> I have found that
>
> SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
> '(?!/D)(st|nd|rd|th)', '', 'g');
> regexp_replace
> ------------------
> 300 nor 126 reet
>
> but
>
> SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
> '(?=/D)(st|nd|rd|th)', '', 'g');
> regexp_replace
> ------------------------
> 300 north 126th street
>
> I'm a novice with regular expressions and google hasn't helped much.
>
> Any suggestions?
>Maybe this?
>select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
>'\1', 'gi');
Hi Steve,
Thanks, but no luck:
select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street
George
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote:
> Hi list,
>
> I'm stumped.
>
> I am trying to use Regexp_Replace to replace ordinal suffixes in addresses
> (eg have '126th' want '126') for comparison purposes. So far no luck.
>
> I have found that
>
> SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
> '(?!/D)(st|nd|rd|th)', '', 'g');
> regexp_replace
> ------------------
> 300 nor 126 reet
>
> but
>
> SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),
> '(?=/D)(st|nd|rd|th)', '', 'g');
> regexp_replace
> ------------------------
> 300 north 126th street
>
> I'm a novice with regular expressions and google hasn't helped much.
>
> Any suggestions?
>Maybe this?
>select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)',
>'\1', 'gi');
Hi Steve,
Thanks, but no luck:
select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)',
E'\1', 'gi');
regexp_replace
------------------------
300 North 126th Street
George
Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote: > From: Steve Atkins > > >Maybe this? > > >select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', > >'\1', 'gi'); > > Hi Steve, > > Thanks, but no luck: > > select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)', > E'\1', 'gi'); > regexp_replace > ------------------------ > 300 North 126th Street > > George Those E’s you added completely change the meaning. If you want to use E-style literals (and you probably don’t) you’ll need to double the backslashes in all the strings. Cheers, Steve
>----- Original Message ----- >From: Steve Atkins >On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote: > > >Maybe this? > > >select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', > >'\1', 'gi'); > > Hi Steve, > > Thanks, but no luck: > > select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)', > E'\1', 'gi'); > regexp_replace > ------------------------ > 300 North 126th Street > > George >Those E’s you added completely change the meaning. If you want to >use E-style literals (and you probably don’t) you’ll need to double the >backslashes in all the strings. Hi Steve, Without the E's: development=# select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); WARNING: nonstandard use of escape in a string literal LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|... ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. WARNING: nonstandard use of escape in a string literal LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'... ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. regexp_replace ------------------------ 300 North 126th Street (1 row) Frustrating... >Cheers, > Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
----- Original Message ----- From: Steve Atkins To: pgsql-general Sent: Friday, February 28, 2014 4:17 PM Subject: Re: [GENERAL] Replacing Ordinal Suffixes On Feb 28, 2014, at 2:04 PM, George Weaver <gweaver@shaw.ca> wrote: > Hi list, > > I'm stumped. > > I am trying to use Regexp_Replace to replace ordinal suffixes in addresses > (eg have '126th' want '126') for comparison purposes. So far no luck. > > I have found that > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), > '(?!/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------ > 300 nor 126 reet > > but > > SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), > '(?=/D)(st|nd|rd|th)', '', 'g'); > regexp_replace > ------------------------ > 300 north 126th street > > I'm a novice with regular expressions and google hasn't helped much. > > Any suggestions? >Maybe this? >select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', >'\1', 'gi'); Hi Steve, Thanks, but no luck: select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)', E'\1', 'gi'); regexp_replace ------------------------ 300 North 126th Street George Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
2014-03-01 8:16 GMT+09:00 George Weaver <gweaver@shaw.ca>: > >> ----- Original Message ----- From: Steve Atkins > > >> On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote: >> >> >Maybe this? >> >> >select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', >> >'\1', 'gi'); >> >> Hi Steve, >> >> Thanks, but no luck: >> >> select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)', >> E'\1', 'gi'); >> regexp_replace >> ------------------------ >> 300 North 126th Street >> >> George > > >> Those E’s you added completely change the meaning. If you want to >> use E-style literals (and you probably don’t) you’ll need to double the >> backslashes in all the strings. > > > Hi Steve, > > Without the E's: > > development=# select regexp_replace('300 North 126th Street', > '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); > WARNING: nonstandard use of escape in a string literal > LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|... > ^ > HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. > WARNING: nonstandard use of escape in a string literal > LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'... > ^ > HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. > > regexp_replace > ------------------------ > 300 North 126th Street > (1 row) > > Frustrating... Per Steve Atkin's note about double backslashes: postgres=> select regexp_replace('300 North 126th Street', E'(\\d+)(?:st|nd|rd|th)', E'\\1', 'gi'); regexp_replace ---------------------- 300 North 126 Street (1 row) Regards Ian Barwick
From: Ian Lawrence Barwick > >> ----- Original Message ----- From: Steve Atkins > > >> On Feb 28, 2014, at 2:43 PM, George Weaver <gweaver@shaw.ca> wrote: >> >> >Maybe this? >> >> >select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', >> >'\1', 'gi'); >> >> Hi Steve, >> >> Thanks, but no luck: >> >> select regexp_replace('300 North 126th Street', E'(\d+)(?:st|nd|rd|th)', >> E'\1', 'gi'); >> regexp_replace >> ------------------------ >> 300 North 126th Street >> >> George > > >> Those E’s you added completely change the meaning. If you want to >> use E-style literals (and you probably don’t) you’ll need to double the >> backslashes in all the strings. > > > Hi Steve, > > Without the E's: > > development=# select regexp_replace('300 North 126th Street', > '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); > WARNING: nonstandard use of escape in a string literal > LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|... > ^ > HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. > WARNING: nonstandard use of escape in a string literal > LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'... > ^ > HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. > > regexp_replace > ------------------------ > 300 North 126th Street > (1 row) > > Frustrating... Per Steve Atkin's note about double backslashes: postgres=> select regexp_replace('300 North 126th Street', E'(\\d+)(?:st|nd|rd|th)', E'\\1', 'gi'); regexp_replace ---------------------- 300 North 126 Street (1 row) Hi Ian, I just got that as well - awesome! Regards Ian Barwick
>>>>> "GW" == George Weaver <gweaver@shaw.ca> writes: GW> SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'), GW> E'(\d)(st|nd|rd|th)', E'\1', 'g'); GW> regexp_replace GW> ------------------------ GW> 300 north 126th street GW> (1 row) The E'' syntax eats your backslashes. For that version, try just: SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),'(\d)(st|nd|rd|th)', '\1', 'g'); -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
George Weaver wrote: > Hi Ian, > > I just got that as well - awesome! http://xkcd.com/1313/ -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi All, >From: James Cloos >The E'' syntax eats your backslashes. For that version, try just: >SELECT REGEXP_REPLACE(LOWER('300 North 126th Street'),'(\d)(st|nd|rd|th)', >'\1', 'g'); Actually, I found that the double backslashes are required whether the E is used or not: development=# select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); WARNING: nonstandard use of escape in a string literal LINE 1: select regexp_replace('300 North 126th Street', '(\d+)(?:st|... ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. WARNING: nonstandard use of escape in a string literal LINE 1: ...'300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'... ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. regexp_replace ------------------------ 300 North 126th Street (1 row) development=# select regexp_replace('300 North 126th Street', '(\\d+)(?:st|nd|rd|th)', '\\1', 'gi'); WARNING: nonstandard use of \\ in a string literal LINE 1: select regexp_replace('300 North 126th Street', '(\\d+)(?:st... ^ HINT: Use the escape string syntax for backslashes, e.g., E'\\'. WARNING: nonstandard use of \\ in a string literal LINE 1: ...300 North 126th Street', '(\\d+)(?:st|nd|rd|th)', '\\1', 'gi... ^ HINT: Use the escape string syntax for backslashes, e.g., E'\\'. regexp_replace ---------------------- 300 North 126 Street (1 row) development=# select regexp_replace('300 North 126th Street', E'(\\d+)(?:st|nd|rd|th)', E'\\1', 'gi'); regexp_replace ---------------------- 300 North 126 Street I appreciate the help of everyone who responded! Cheers, George
"George Weaver" <gweaver@shaw.ca> writes: > Actually, I found that the double backslashes are required whether the E is > used or not: You must be using a relatively old PG version then. Default behavior since around 9.1 has been that backslashes aren't special except in E'' strings. regards, tom lane
> Actually, I found that the double backslashes are required whether the E > is > used or not: >You must be using a relatively old PG version then. Default behavior >since around 9.1 has been that backslashes aren't special except >in E'' strings. Hmm. development=# select version(); version ------------------------------------------------------------ PostgreSQL 9.1.9, compiled by Visual C++ build 1500, 32-bit (1 row) >regards, tom lane
On Mar 1, 2014, at 11:45 AM, George Weaver <gweaver@shaw.ca> wrote: > >> Actually, I found that the double backslashes are required whether the E is >> used or not: > >> You must be using a relatively old PG version then. Default behavior >> since around 9.1 has been that backslashes aren't special except >> in E'' strings. > > Hmm. > > development=# select version(); > version > ------------------------------------------------------------ > PostgreSQL 9.1.9, compiled by Visual C++ build 1500, 32-bit > (1 row) steve=# select version(); version -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- PostgreSQL 9.1.4 on x86_64-apple-darwin12.0.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on AppleInc. build 5658) (LLVM build 2336.11.00), 64-bit (1 row) steve=# select regexp_replace('300 North 126th Street', '(\d+)(?:st|nd|rd|th)', '\1', 'gi'); regexp_replace ---------------------- 300 North 126 Street (1 row) I suspect you have standard_conforming_strings set to off (it defaults to on in 9.1), possibly for backwards compatibilityto support an app you’re using that’s not been updated, possibly accidentally. Cheers, Steve
----- Original Message ----- From: Steve Atkins <snip> >I suspect you have standard_conforming_strings set to off (it defaults to >on in 9.1), possibly for backwards compatibility to support an app you’re >using that’s not been updated, possibly accidentally. You're right - it was off (now On). Though for the life of me I can't say why... On the production installations I support its On. Thanks! >Cheers, > Steve