RE: Regex Replace with 2 conditions - Mailing list pgsql-general

From Denisa Cirstescu
Subject RE: Regex Replace with 2 conditions
Date
Msg-id CY1PR12MB00251473B9810794A05579A1E6FE0@CY1PR12MB0025.namprd12.prod.outlook.com
In response to Re: Regex Replace with 2 conditions  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Regex Replace with 2 conditions  (Francisco Olarte <folarte@peoplecall.com>)
List pgsql-general
Francisco,

I tried the version you are proposing before posting this question, but it does not do what I need: it also removes
characters with codes greater than 255, and those are characters that I need to keep, such as "ă".

    SELECT regexp_replace(p_string, E'[^A-Za-z0-9%_]', '', 'g');

This is the requirement that I have: write a function that removes all characters with codes 1-255 that are not A-Z,
a-z, 0-9, or the special characters % and _.

Tom,

I have tried what you suggested with the lookahead and it is working.
It is exactly what I needed. The final version of the function is:

    CREATE OR REPLACE FUNCTION testFunction(p_string CHARACTER VARYING) RETURNS VARCHAR AS $$
        SELECT regexp_replace(p_string, E'(?=[' || CHR(1) || '-' || CHR(255) || '])[^A-Za-z0-9%_]', '', 'g');
    $$ LANGUAGE sql IMMUTABLE;
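
For anyone who wants to check the pattern outside the database, here is a minimal sketch in Python. It assumes Python's re module treats the lookahead the same way for this case; the names PATTERN, NAIVE and strip_low_range are made up for illustration and are not part of the function above.

```python
import re

# The lookahead limits candidates to code points 1-255; the negated class
# then matches only those candidates that are not on the allow-list.
PATTERN = re.compile(r"(?=[\x01-\xff])[^A-Za-z0-9%_]")

# Plain negated class, as in the earlier attempt: it has no upper bound,
# so it also removes characters above 255 such as U+0103.
NAIVE = re.compile(r"[^A-Za-z0-9%_]")


def strip_low_range(text: str) -> str:
    """Hypothetical helper mirroring the intent of testFunction."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    # U+0103 is above 255 and should survive; U+00E9 is inside 1-255.
    sample = "a\u0103#b c_%\u00e9"
    print(strip_low_range(sample))
    print(NAIVE.sub("", sample))
```

The only difference between the two patterns is the lookahead, which acts as the second condition that the && attempt was trying to express.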


Thanks a lot,
Denisa Cîrstescu


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, February 5, 2018 4:43 PM
To: Denisa Cirstescu <Denisa.Cirstescu@tangoe.com>
Cc: pgsql-general@postgresql.org
Subject: Re: Regex Replace with 2 conditions

Denisa Cirstescu <Denisa.Cirstescu@tangoe.com> writes:
> Is there a way to specify 2 conditions in regexp_replace?
> I need an SQL function that eliminates all ASCII characters from 1-255 that are not A-Z, a-z, 0-9, and special
> characters % and _, so something like:
> SELECT regexp_replace(p_string, E'[' || CHR(1) || '-' || CHR(255) ||
> '&&[^A-Za-z0-9%_]]', '', 'g')); But this syntax is not really working.

Nope, because there's no && operator in regexes.

But I think you could get what you want by using lookahead or lookbehind to combine additional condition(s) with a
basic character-class pattern.
Something like

    (?=[\001-\377])[^A-Za-z0-9%_]

            regards, tom lane

