Thread: regexp_matches problem
(postgresql 8.3.7, linux centos)
I made the following regexp_matches
SELECT regexp_matches(
(
SELECT content
FROM page
WHERE idpage = 2
)
,','<img\\s+((title="[^"]+")|(alt="[^"]+")|([^>]))*>'
, 'ig'
) AS result
The result looks like: {"\"",NULL,NULL,"\""} in phpPgAdmin.
Which is not the right result. I want to see als the images and their alt, title and src.
Can anyone help with telling what I am doing wrong?
Willem
When I make the following simplified example:
SELECT regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">'
,'((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig')
My result are 2 rows:
"{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}"
"{"title=\"dit is een title tekst\"","title=\"dit is een title tekst\"","dit is een title tekst",NULL,NULL,NULL}"
I would like to have 1 row which contains both the records. Does anyone know how I can solve this?
Willem
Van: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] Namens WP Perquin
Verzonden: dinsdag 12 mei 2009 13:44
Aan: pgsql-general@postgresql.org
Onderwerp: [GENERAL] regexp_matches problem
(postgresql 8.3.7, linux centos)
I made the following regexp_matches
SELECT regexp_matches(
(
SELECT content
FROM page
WHERE idpage = 2
)
,','<img\\s+((title="[^"]+")|(alt="[^"]+")|([^>]))*>'
, 'ig'
) AS result
The result looks like: {"\"",NULL,NULL,"\""} in phpPgAdmin.
Which is not the right result. I want to see als the images and their alt, title and src.
Can anyone help with telling what I am doing wrong?
Willem
WP Perquin wrote: > When I make the following simplified example: > > SELECT regexp_matches('<img src="wwww" title="dit is een > title tekst" class="class12">' > > ,'((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig') > > My result are 2 rows: > > "{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}" > > "{"title=\"dit is een title tekst\"","title=\"dit is een > title tekst\"","dit is een title tekst",NULL,NULL,NULL}" > > I would like to have 1 row which contains both the records. > Does anyone know how I can solve this? Do you really want all those NULLs? Is that what you want: SELECT match[1] FROM regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">', '(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")', 'ig') AS match; This query returns two rows. When you write that you want to have one row that contains both records, do you mean: a) one string that is the concatenation of both strings or b) one row that is a single array with two string elements Whatever it is you want, you will probably need to write a little aggregate function that does that for you, something like CREATE FUNCTION text_cats(state text[], nextv text) RETURNS text[] IMMUTABLE STRICT LANGUAGE sql AS 'SELECT $1 || $2'; CREATE AGGREGATE text_cat(text) ( SFUNC = text_cats, STYPE = text[], INITCOND = '{}' ); for variant b). Then you can SELECT text_cat(match[1]) FROM regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">', '(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")', 'ig') AS match; Yours, Laurenz Albe
Who many thanks Laurenz. You gave me the right direction. The following is what I made up and it does exactly what I want it to do: --collect the specific string data for each result row SELECT ( SELECT match[1] FROM regexp_matches(secondresult.imgstring,'title="([^"]+)"','ig') AS match ) AS titletekst ,( SELECT match[1] FROM regexp_matches(secondresult.imgstring,'alt="([^"]+)"','ig') AS match ) AS alttekst ,( SELECT match[1] FROM regexp_matches(secondresult.imgstring,'src="([^"]+)"','ig') AS match ) AS imgsrc FROM ( --collect the image string of every found image SELECT firstresult[1] AS imgstring FROM regexp_matches('<img src="wwww" title="titletext " class="class12">' ,'<img[[:space:]]+([^>]*)>' ,'ig' ) AS firstresult ) as secondresult This is it. Very simple and very efficient. It took me ages to find out. Hopefully it helps other people. Willem -----Oorspronkelijk bericht----- Van: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] Namens Albe Laurenz Verzonden: woensdag 13 mei 2009 10:50 Aan: WP Perquin *EXTERN*; pgsql-general@postgresql.org Onderwerp: Re: [GENERAL] regexp_matches problem WP Perquin wrote: > When I make the following simplified example: > > SELECT regexp_matches('<img src="wwww" title="dit is een > title tekst" class="class12">' > > ,'((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig') > > My result are 2 rows: > > "{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}" > > "{"title=\"dit is een title tekst\"","title=\"dit is een > title tekst\"","dit is een title tekst",NULL,NULL,NULL}" > > I would like to have 1 row which contains both the records. > Does anyone know how I can solve this? Do you really want all those NULLs? Is that what you want: SELECT match[1] FROM regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">', '(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")', 'ig') AS match; This query returns two rows. When you write that you want to have one row that contains both records, do you mean: a) one string that is the concatenation of both strings or b) one row that is a single array with two string elements Whatever it is you want, you will probably need to write a little aggregate function that does that for you, something like CREATE FUNCTION text_cats(state text[], nextv text) RETURNS text[] IMMUTABLE STRICT LANGUAGE sql AS 'SELECT $1 || $2'; CREATE AGGREGATE text_cat(text) ( SFUNC = text_cats, STYPE = text[], INITCOND = '{}' ); for variant b). Then you can SELECT text_cat(match[1]) FROM regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">', '(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")', 'ig') AS match; Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general