regexp_matches problem
(postgresql 8.3.7, linux centos)
I made the following regexp_matches
SELECT regexp_matches(
(
SELECT content
FROM page
WHERE idpage = 2
)
,','<img\\s+((title="[^"]+")|(alt="[^"]+")|([^>]))*>'
, 'ig'
) AS result
The result looks like: {"\"",NULL,NULL,"\""} in phpPgAdmin.
Which is not the right result. I want to see als the images and their alt,
title and src.
Can anyone help with telling what I am doing wrong?
Willem
When I make the following simplified example:
SELECT regexp_matches('<img src="wwww" title="dit is een title tekst"
class="class12">'
,'((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig')
My result are 2 rows:
"{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}"
"{"title=\"dit is een title tekst\"","title=\"dit is een title tekst\"","dit
is een title tekst",NULL,NULL,NULL}"
I would like to have 1 row which contains both the records. Does anyone know
how I can solve this?
Willem
_____
Van: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] Namens WP Perquin
Verzonden: dinsdag 12 mei 2009 13:44
Aan: pgsql-general@postgresql.org
Onderwerp: [GENERAL] regexp_matches problem
(postgresql 8.3.7, linux centos)
I made the following regexp_matches
SELECT regexp_matches(
(
SELECT content
FROM page
WHERE idpage = 2
)
,','<img\\s+((title="[^"]+")|(alt="[^"]+")|([^>]))*>'
, 'ig'
) AS result
The result looks like: {"\"",NULL,NULL,"\""} in phpPgAdmin.
Which is not the right result. I want to see als the images and their alt,
title and src.
Can anyone help with telling what I am doing wrong?
Willem
WP Perquin wrote:
When I make the following simplified example:
SELECT regexp_matches('<img src="wwww" title="dit is een
title tekst" class="class12">','((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig')
My result are 2 rows:
"{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}"
"{"title=\"dit is een title tekst\"","title=\"dit is een
title tekst\"","dit is een title tekst",NULL,NULL,NULL}"I would like to have 1 row which contains both the records.
Does anyone know how I can solve this?
Do you really want all those NULLs?
Is that what you want:
SELECT match[1]
FROM regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">',
'(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")',
'ig') AS match;
This query returns two rows.
When you write that you want to have one row that contains both
records, do you mean:
a) one string that is the concatenation of both strings
or
b) one row that is a single array with two string elements
Whatever it is you want, you will probably need to write a
little aggregate function that does that for you, something like
CREATE FUNCTION text_cats(state text[], nextv text) RETURNS text[]
IMMUTABLE STRICT LANGUAGE sql
AS 'SELECT $1 || $2';
CREATE AGGREGATE text_cat(text) (
SFUNC = text_cats,
STYPE = text[],
INITCOND = '{}' );
for variant b).
Then you can
SELECT text_cat(match[1])
FROM regexp_matches('<img src="wwww" title="dit is een title tekst" class="class12">',
'(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")',
'ig') AS match;
Yours,
Laurenz Albe
Who many thanks Laurenz.
You gave me the right direction.
The following is what I made up and it does exactly what I want it to do:
--collect the specific string data for each result row
SELECT (
SELECT match[1] FROM
regexp_matches(secondresult.imgstring,'title="([^"]+)"','ig') AS match
) AS titletekst
,(
SELECT match[1] FROM
regexp_matches(secondresult.imgstring,'alt="([^"]+)"','ig') AS match
) AS alttekst
,(
SELECT match[1] FROM
regexp_matches(secondresult.imgstring,'src="([^"]+)"','ig') AS match
) AS imgsrc
FROM (
--collect the image string of every found image
SELECT firstresult[1] AS imgstring
FROM regexp_matches('<img src="wwww" title="titletext "
class="class12">'
,'<img[[:space:]]+([^>]*)>'
,'ig'
) AS firstresult
) as secondresult
This is it. Very simple and very efficient.
It took me ages to find out. Hopefully it helps other people.
Willem
-----Oorspronkelijk bericht-----
Van: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] Namens Albe Laurenz
Verzonden: woensdag 13 mei 2009 10:50
Aan: WP Perquin *EXTERN*; pgsql-general@postgresql.org
Onderwerp: Re: [GENERAL] regexp_matches problem
WP Perquin wrote:
When I make the following simplified example:
SELECT regexp_matches('<img src="wwww" title="dit is een
title tekst" class="class12">','((title\s*=\s*\"([^"]*)")+)|((src\s*=\s*\"([^"]*)")+)','ig')
My result are 2 rows:
"{NULL,NULL,NULL,"src=\"wwww\"","src=\"wwww\"",wwww}"
"{"title=\"dit is een title tekst\"","title=\"dit is een
title tekst\"","dit is een title tekst",NULL,NULL,NULL}"I would like to have 1 row which contains both the records.
Does anyone know how I can solve this?
Do you really want all those NULLs?
Is that what you want:
SELECT match[1]
FROM regexp_matches('<img src="wwww" title="dit is een title tekst"
class="class12">',
'(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")',
'ig') AS match;
This query returns two rows.
When you write that you want to have one row that contains both
records, do you mean:
a) one string that is the concatenation of both strings
or
b) one row that is a single array with two string elements
Whatever it is you want, you will probably need to write a
little aggregate function that does that for you, something like
CREATE FUNCTION text_cats(state text[], nextv text) RETURNS text[]
IMMUTABLE STRICT LANGUAGE sql
AS 'SELECT $1 || $2';
CREATE AGGREGATE text_cat(text) (
SFUNC = text_cats,
STYPE = text[],
INITCOND = '{}' );
for variant b).
Then you can
SELECT text_cat(match[1])
FROM regexp_matches('<img src="wwww" title="dit is een title tekst"
class="class12">',
'(title\s*=\s*\"[^"]*"|src\s*=\s*\"[^"]*")',
'ig') AS match;
Yours,
Laurenz Albe
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general