substring syntax with regexp

Started by joseph speiglealmost 22 years ago7 messagesgeneral
Jump to latest
#1joseph speigle
joe.speigle@jklh.us

hi,

Does anybody know offhand what is the correct way to use substr to extract the domain name from a client_referer column as logged by mod_pgsqllog (httpd module), by correcting the following:

the file 'hostname.sql' is pl/pgsql

main=> \e hostname.sql
ERROR: syntax error at or near "http" at character 290
LINE 13: newuri = substr(tempuri from 'http://[^/]*/.*');
^
I don't know, the docs are:

The substring function with two parameters, substring(string from pattern), provides extraction of a substring that matches a POSIX regular expression pattern.

(http://www.postgresql.org/docs/current/static/functions-matching.html)

the full 'hostname.sql' as is now is:

CREATE or replace FUNCTION hostname() RETURNS setof logpgsql.stats_type as
'
declare
row stats_type%ROWTYPE;
rec record;
newurl varchar(100);
tempurl varchar(100);
begin
for rec in SELECT * from stats
loop
row.c = rec.c;
tempurl = rec.url;
newuri = substr(tempuri from 'http://[^/]*/.*');
row.uri = newurl;
row.r = rec.r;
return next row;
end loop;
return next row;
return;
end
'
LANGUAGE 'plpgsql';

#2Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: joseph speigle (#1)
Re: substring syntax with regexp

On Wed, Jun 30, 2004 at 08:45:18AM -0500, joseph speigle wrote:

hi,

Does anybody know offhand what is the correct way to use substr to
extract the domain name from a client_referer column as logged by
mod_pgsqllog (httpd module), by correcting the following:

Nah, your problem is before the syntax of the substr function. You have
to quote your single quotes in the function text. So the relevant like
would look like

newuri = substr(tempuri from ''http://[^/]*/.*'');

I didn't look at the rest of the function ...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Uno combate cuando es necesario... �no cuando est� de humor!
El humor es para el ganado, o para hacer el amor, o para tocar el
baliset. No para combatir." (Gurney Halleck)

#3Ian Lawrence Barwick
barwick@gmail.com
In reply to: joseph speigle (#1)
Re: substring syntax with regexp

On Wed, 30 Jun 2004 08:45:18 -0500, joseph speigle <joe.speigle@jklh.us> wrote:

hi,

Does anybody know offhand what is the correct way to use substr to extract the domain name from a client_referer column as logged by mod_pgsqllog (httpd module), by correcting the following:

the file 'hostname.sql' is pl/pgsql

main=> \e hostname.sql
ERROR: syntax error at or near "http" at character 290
LINE 13: newuri = substr(tempuri from 'http://[^/]*/.*&#39;);

You have several immediate problems with this line:
- your regex should be double-quoted;
- the relevant function is "substring", not "substr";
- the regex pattern you want is probably more like :
'http://([^/]*)/&#39;, e.g.

test=> select substring('http://www.example.com/dir/file.html&#39; from
'http://([^/]*)&#39;);
substring
-------------
www.example.com
(1 row)

HTH

Ian Barwick

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: joseph speigle (#1)
Re: substring syntax with regexp

joseph speigle <joe.speigle@jklh.us> writes:

CREATE or replace FUNCTION hostname() RETURNS setof logpgsql.stats_type as
'
declare
...
newuri = substr(tempuri from 'http://[^/]*/.*&#39;);

You forgot about doubling quotes within a function body :-(

regards, tom lane

#5Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: joseph speigle (#1)
Re: substring syntax with regexp

On Wed, 30 Jun 2004, joseph speigle wrote:

main=> \e hostname.sql
ERROR: syntax error at or near "http" at character 290
LINE 13: newuri = substr(tempuri from 'http://[^/]*/.*&#39;);

You probably need to double those quotes ('') since it's inside
a quoted string (the function body).

#6Richard Huxton
dev@archonet.com
In reply to: joseph speigle (#1)
Re: substring syntax with regexp

joseph speigle wrote:

hi,

Does anybody know offhand what is the correct way to use substr to
extract the domain name from a client_referer column as logged by
mod_pgsqllog (httpd module), by correcting the following:

You have a quoting problem

CREATE or replace FUNCTION hostname() RETURNS setof
logpgsql.stats_type as ' declare row stats_type%ROWTYPE; rec record;
newurl varchar(100); tempurl varchar(100); begin for rec in SELECT *
from stats loop row.c = rec.c; tempurl = rec.url; newuri =
substr(tempuri from 'http://[^/]*/.*&#39;);

OK, you probably want to use := for assignment. Also, you're already
inside one set of quotes, so you'll need to escape the quotes for your
string.

newuri := substr(tempuri from ''http://[^/]*/.*&#39;&#39;);
or
newuri := substr(tempuri from \'http://[^/]*/.*\&#39;);

--
Richard Huxton
Archonet Ltd

#7joseph speigle
joe.speigle@jklh.us
In reply to: joseph speigle (#1)
Re: substring syntax with regexp

hi,
thanks everybody. It was a combination of:

changing function name from substr to substring
double-quoting inside the function
using select into inside the plpgsql function to retrieve the substring

-- retrieve hostname from the client_referrer field
CREATE OR REPLACE FUNCTION hostname() RETURNS setof modlog.stats_type AS
'
DECLARE
row stats_type%ROWTYPE;
rec record;
newclient_referrer varchar(100);
BEGIN
FOR rec IN SELECT * FROM stats ORDER BY uri_hits DESC
LOOP
row.uri_hits = rec.uri_hits;
SELECT INTO row.client_referrer SUBSTRING(rec.client_referrer FROM ''http://([^/]*).*&#39;&#39;);
IF row.client_referrer IS NULL THEN
-- /cgi-bin/404.pl etc.
row.client_referrer := ''localhost'';
END IF;
row.uri = NULL;
RETURN NEXT row;
END LOOP;
RETURN NEXT ROW;
RETURN;
END
'
LANGUAGE 'plpgsql';
select * from hostname();