Re: Regex help again (sorry, I am bad at these) - Mailing list pgsql-general

From David G. Johnston
Subject Re: Regex help again (sorry, I am bad at these)
Msg-id CAKFQuwZ_1PC-SabOzn8knMOowDEiSkaexsi+9TpG3kGie1DhyQ@mail.gmail.com
In response to Regex help again (sorry, I am bad at these)  (Christopher Molnar <cmolnar@ourworldservices.com>)
List pgsql-general
On Mon, Dec 28, 2015 at 12:10 PM, Christopher Molnar <cmolnar@ourworldservices.com> wrote:

Given this...

'<p>Complete the attached lab and submit via dropbox</p>\r<p><a href="https://owncloud.porterchester.edu/HVACR/PCI_GasHeat/GasElectrical/HVACR1114_LAB_13A.pdf" title="Lab 13A">Lab 13A<\a>'


I have no clue how the following gives you any matches. Specifically, the "$" after "title=" causes the entire pattern to always fail, since that point isn't the end of the string.

 update pcilms_assign set intro=regexp_replace(intro, '/([^/]*)\" title=$', '&files=\1') where intro like '%https://owncloud.porterchester.edu%' and course=18 and id=55413;

and the result puts the &file= in the wrong place (at the end of the whole string).
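One way to check the point about "$" is to run the same pattern against a short test string. This is only a sketch: the sample string is made up, and it assumes standard_conforming_strings is on (the default in current releases) so that the backslashes reach the regex engine as written.

```sql
-- Hypothetical sample string; the pattern is the one from the UPDATE above.
SELECT regexp_replace(
         '<a href="https://www.www.www/path/FILE.pdf" title="FILE">',
         '/([^/]*)\" title=$',
         '&files=\1') AS with_dollar;
-- "$" matches only at the end of the string, and the text continues past
-- title=, so nothing matches and the input should come back unchanged.
```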

The basic problem is that the entirety of the content your pattern matches is replaced with the totality of the replacement expression.  Since you are matching the literal "title=", you have to somehow place that same literal in the result.  You can capture it and then use "\2", or you can place it literally like Félix shows.
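Both options can be sketched against a made-up test string. The sample string and column aliases are assumptions for illustration, and the second query is just one way of repeating the literal, not necessarily the exact form used elsewhere in the thread.

```sql
-- Option 1: capture the trailing literal as group 2 and re-emit it with \2.
SELECT regexp_replace(
         '<a href="https://www.www.www/path/FILE.pdf" title="FILE">',
         '/([^/]*)(" title=)',
         '&files=\1\2') AS via_second_group;

-- Option 2: repeat the literal text in the replacement expression.
SELECT regexp_replace(
         '<a href="https://www.www.www/path/FILE.pdf" title="FILE">',
         '/([^/]*)" title=',
         '&files=\1" title=') AS via_literal;
```

In both cases the matched text is '/FILE.pdf" title=', so the replacement has to put '" title=' back for the attribute to survive.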

Alternatively, don't consume it.  The way you test for something without making it part of the match is by using what is termed a "zero-width" expression or a "look-around".  In this case you want a "look-ahead", which is written as (?=...) with the text to look for placed inside the parentheses.

So...

'/([^/]*)(?=" title=)'

SELECT regexp_replace('<a href="https://www.www.www/path/FILE.pdf" title="FILE">', '/([^/]*)(?=" title=)', '&files=\1');
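Applied to the original statement, the look-ahead version might look like the following. Treat it as an untested sketch: the table, column, and filter values are copied from the UPDATE quoted earlier, and wrapping it in a transaction is my own suggestion so the result can be inspected before it is kept.

```sql
BEGIN;

UPDATE pcilms_assign
   SET intro = regexp_replace(intro, '/([^/]*)(?=" title=)', '&files=\1')
 WHERE intro LIKE '%https://owncloud.porterchester.edu%'
   AND course = 18
   AND id = 55413;

-- Inspect the row before deciding whether to keep the change.
SELECT intro FROM pcilms_assign WHERE id = 55413;

ROLLBACK;  -- switch to COMMIT once the output looks right
```

Without the 'g' flag regexp_replace only changes the first match in each value, so a row containing several such links would need 'g' passed as a fourth argument.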

David J.
