Thread: regexp_replace and search/replace values stored in table

regexp_replace and search/replace values stored in table

From
Leif Biberg Kristensen
Date:
I've got a system for entering and storing a lot of standard hyperlinks in a 
compact form, and then expand them at run time like this:

CREATE OR REPLACE FUNCTION _my_expand(TEXT) RETURNS TEXT AS $$
-- private func, expand various compact links
DECLARE   str TEXT = $1;
BEGIN   -- Scanned church books [kb=book reference|image reference|link text]   str := REGEXP_REPLACE(str,
E'\\[kb=(.+?)\\|(.+?)\\|(.+?)\\]',          E'<a 
 
href="//www.arkivverket.no/URN:kb_read?idx_kildeid=\\1&uid=ny&idx_side=\\2" 
title="Lenke til bilde av kirkebokside">\\3</a>', 'g');   -- Scanned probate registers [sk=protocol|image
reference|linktext]   str := REGEXP_REPLACE(str, E'\\[sk=(.+?)\\|(.+?)\\|(.+?)\\]',           E'<a
href="//www.arkivverket.no/URN:sk_read/\\1/\\2"title="Lenke 
 
til bilde av skifteprotokollside">\\3</a>', 'g');   -- Scanned deed ("pantebok") registers [sk=protocol|image
reference|link
 
text]   str := REGEXP_REPLACE(str, E'\\[tl=(.+?)\\|(.+?)\\|(.+?)\\]',           E'<a 
href="//www.arkivverket.no/URN:tl_read?idx_id=\\1&uid=ny&idx_side=\\2" 
title="Lenke til bilde av pantebokside">\\3</a>', 'g');   RETURN str;
END
$$ LANGUAGE plpgsql STABLE;

According to the slogan "minimize code, maximize data" I feel that these 
strings should be stored in a table:

CREATE TABLE short_links (   link_type CHAR(2) PRIMARY KEY,   short_link TEXT,   long_link TEXT,   description TEXT
);

It appears like I have to double the number of backslashes when I enter the 
data:

INSERT INTO short_links (link_type, short_link, long_link, description) VALUES
('sk', E'\\\\[sk=(.+?)\\\\|(.+?)\\\\|(.+?)\\\\]',
E'<a href="//www.arkivverket.no/URN:sk_read/\\\\1/\\\\2" title="Lenke til 
bilde av skifteprotokollside">\\\\3</a>',
'Scanned probate registers [sk=protocol|image reference|link text]');

pgslekt=> select * from short_links;
link_type |           short_link           |                                                 
long_link                                                 |                            
description

-----------+--------------------------------+-----------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------sk
      | \\[sk=(.+?)\\|(.+?)\\|(.+?)\\] | <a 
 
href="//www.arkivverket.no/URN:sk_read/\\1/\\2" title="Lenke til bilde av 
skifteprotokollside">\\3</a> | Scanned probate registers [sk=protocol|image 
reference|link text]
(1 row)

So far, so good. But when I try to do the actual expansion, I'm stumped.

pgslekt=> select regexp_replace((select source_text from sources where 
source_id=23091), (select quote_literal(short_link) from short_links where 
link_type = 'sk'), (select quote_literal(long_link) from short_links where 
link_type = 'sk'), 'g');                                        regexp_replace
------------------------------------------------------------------------------------------------[sk=25658|67|side 66a].
Vabakkenunder Klyve vestre i Solum 07.07.1784: 
 
[p=6947|Isach Jonsen].
(1 row)

What am I missing?

regards,
-- 
Leif Biberg Kristensen
http://solumslekt.org/


Re: regexp_replace and search/replace values stored in table

From
Tim Landscheidt
Date:
Leif Biberg Kristensen <leif@solumslekt.org> wrote:

> [...]
> So far, so good. But when I try to do the actual expansion, I'm stumped.

> pgslekt=> select regexp_replace((select source_text from sources where
> source_id=23091), (select quote_literal(short_link) from short_links where
> link_type = 'sk'), (select quote_literal(long_link) from short_links where
> link_type = 'sk'), 'g');
>                                          regexp_replace
> ------------------------------------------------------------------------------------------------
>  [sk=25658|67|side 66a]. Vabakken under Klyve vestre i Solum 07.07.1784:
> [p=6947|Isach Jonsen].
> (1 row)

> What am I missing?

For starters, omit the call to quote_literal ().

Tim



Re: regexp_replace and search/replace values stored in table

From
Leif Biberg Kristensen
Date:
On Tuesday 27. April 2010 15.04.23 Tim Landscheidt wrote:
> Leif Biberg Kristensen <leif@solumslekt.org> wrote:
> 
> > [...]
> > So far, so good. But when I try to do the actual expansion, I'm stumped.
> 
> > pgslekt=> select regexp_replace((select source_text from sources where
> > source_id=23091), (select quote_literal(short_link) from short_links where
> > link_type = 'sk'), (select quote_literal(long_link) from short_links where
> > link_type = 'sk'), 'g');
> >                                          regexp_replace
> > 
------------------------------------------------------------------------------------------------
> >  [sk=25658|67|side 66a]. Vabakken under Klyve vestre i Solum 07.07.1784:
> > [p=6947|Isach Jonsen].
> > (1 row)
> 
> > What am I missing?
> 
> For starters, omit the call to quote_literal ().
> 
> Tim

Tim,
that was actually the first thing i tried, but I omitted it from the post:

pgslekt=> select regexp_replace((select source_text from sources where 
source_id=23091), (select short_link from short_links where link_type = 'sk'), 
(select long_link from short_links where link_type = 'sk'), 'g');
regexp_replace
------------------------------------------------------------------------------------------------[sk=25658|67|side 66a].
Vabakkenunder Klyve vestre i Solum 07.07.1784: 
 
[p=6947|Isach Jonsen].
(1 row)

It doesn't work either.

regards,
-- 
Leif Biberg Kristensen
http://solumslekt.org/


Re: regexp_replace and search/replace values stored in table

From
Leif Biberg Kristensen
Date:
On Tuesday 27. April 2010 13.43.48 Leif Biberg Kristensen wrote:
> CREATE TABLE short_links (
>     link_type CHAR(2) PRIMARY KEY,
>     short_link TEXT,
>     long_link TEXT,
>     description TEXT
> );
> 
> It appears like I have to double the number of backslashes when I enter the 
> data:
> 
> INSERT INTO short_links (link_type, short_link, long_link, description) 
VALUES
> ('sk', E'\\\\[sk=(.+?)\\\\|(.+?)\\\\|(.+?)\\\\]',
> E'<a href="//www.arkivverket.no/URN:sk_read/\\\\1/\\\\2" title="Lenke til 
> bilde av skifteprotokollside">\\\\3</a>',
> 'Scanned probate registers [sk=protocol|image reference|link text]');

I found the solution. I was totally wrong about the 4 backslashes.

pgslekt=> delete from short_links;
DELETE 1
pgslekt=> INSERT INTO short_links (link_type, short_link, long_link, 
description) VALUES
pgslekt-> ('sk', E'\\[sk=(.+?)\\|(.+?)\\|(.+?)\\]',
pgslekt(> E'<a href="//www.arkivverket.no/URN:sk_read/\\1/\\2" title="Lenke 
til bilde av skifteprotokollside">\\3</a>',
pgslekt(> 'Scanned probate registers [sk=protocol|image reference|link 
text]');
INSERT 0 1
pgslekt=> select regexp_replace((select source_text from sources where 
source_id=23091), (select short_link from  long_link from short_links where 
link_type = 'sk'), 'g');
                                                                                    regexp_replace              

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<a
href="//www.arkivverket.no/URN:sk_read/25658/67"title="Lenke til bilde av 
 
skifteprotokollside">side 66a</a>. Vabakken under Klyve vestre i Solum 
07.07.1784: [p=6947|Isach Jonsen].
(1 row)

Beautiful. Now I'm going replace this big ugly function with a loop reading 
values from a table. I may even write an interface for managing shortlinks :)

regards,
-- 
Leif Biberg Kristensen
http://solumslekt.org/


Re: regexp_replace and search/replace values stored in table

From
Leif Biberg Kristensen
Date:
Followup. Replaced Big Ugly Function with:

CREATE OR REPLACE FUNCTION _my_expand(TEXT) RETURNS TEXT AS $$
-- private func, expand various compact links
DECLARE   str TEXT = $1;   links RECORD;

BEGIN   FOR links IN SELECT short_link, long_link FROM short_links LOOP       str := REGEXP_REPLACE(str,
links.short_link,links.long_link, 'g');   END LOOP;   RETURN str;
 
END
$$ LANGUAGE plpgsql IMMUTABLE;

By the way, those who haven't read this gem should probably do so: 
<http://database-programmer.blogspot.com/2008/05/minimize-code-maximize-
data.html>

regards,
-- 
Leif Biberg Kristensen
http://solumslekt.org/


Re: regexp_replace and search/replace values stored in table

From
Leif Biberg Kristensen
Date:
Followup II: I've written a blog post on the subject at 
<http://solumslekt.org/blog/?p=151>.
regards,
-- 
Leif Biberg Kristensen
http://solumslekt.org/


Re: regexp_replace and search/replace values stored in table

From
"Sofer, Yuval"
Date:
Hi

Postgres crashes with -

PG "FATAL:  could not reattach to shared memory (key=5432001, addr=02100000): Invalid argument.

The version is 8.2.4, the platform is win32

Does someone know the reason/workaround ?

Thanks,
Yuval Sofer
BMC Software
CTM&D Business Unit
DBA Team
972-52-4286-282
yuval_sofer@bmc.com



-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Leif Biberg Kristensen
Sent: Saturday, May 01, 2010 8:29 PM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] regexp_replace and search/replace values stored in table

Followup II: I've written a blog post on the subject at
<http://solumslekt.org/blog/?p=151>.
regards,
--
Leif Biberg Kristensen
http://solumslekt.org/

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql