Re: Extracting hostname from URI column - Mailing list pgsql-sql

From Paul Lambert
Subject Re: Extracting hostname from URI column
Date
Msg-id 46E7315C.3060006@autoledgers.com.au
Whole thread Raw
In response to Extracting hostname from URI column  (ogjunk-pgjedan@yahoo.com)
List pgsql-sql
ogjunk-pgjedan@yahoo.com wrote:
> Hi,
> 
> I'm trying to use substr() and position() functions to extract the full host name (and later a domain) from a column
thatholds URLs.
 
> This is what I'm trying, but it clearly doesn't do the job.
> 
> => select substr(href, position('://' in href)+3, position('://' in href)+3+position('/' in href)), href from url
whereid <10;
 
>      substr     |                           href
> ----------------+----------------------------------------------------------
>  texturizer.net | http://texturizer.net/firebird/extensions/
>  texturizer.net | http://texturizer.net/firebird/themes/
>  forums.mozilla | http://forums.mozillazine.org/index.php?c=4
>  www.mozillazin | http://www.mozillazine.org/
>  devedge.netsca | http://devedge.netscape.com/viewsource/2002/bookmarks/
>  www.google.com | http://www.google.com/search?&q=%s
>  groups.google. | http://groups.google.com/groups?scoring=d&q=%s
>  www.google.com | http://www.google.com/search?q=%s&btnI=I'm+Feeling+Lucky
>  dictionary.ref | http://dictionary.reference.com/search?q=%s
> 
> The 3rd param to the substr function is clearly wrong.  Is it even doable without writing a procedure?
> 
> Finally, is this the fastest way to get this data, or is there  regex-based function that might be faster?
> 
> Thanks,
> Otis
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 
> 

Try this:

test=# select substr(href, position('://' in href)+3, position('/' in 
substr(
href,position('://' in href)+3))-1), href from url;          substr          |                           href

--------------------------+-----------------------------------------------------
----- devedge.netscape.com     | 
http://devedge.netscape.com/viewsource/2002/bookmark
s/ texturizer.net           | http://texturizer.net/firebird/extensions/ texturizer.net           |
http://texturizer.net/firebird/themes/forums.mozillazine.org   | http://forums.mozillazine.org/index.php?c=4
www.mozillazine.org     | http://www.mozillazine.org/ devedge.netscape.com     | 
 
http://devedge.netscape.com/viewsource/2002/bookmark
s/ www.google.com           | http://www.google.com/search?&q=%s groups.google.com        |
http://groups.google.com/groups?scoring=d&q=%swww.google.com           | 
 
http://www.google.com/search?q=%s&btnI=I'm+Feeling+L
ucky dictionary.reference.com | http://dictionary.reference.com/search?q=%s
(10 rows)

-- 
Paul Lambert
Database Administrator
AutoLedgers


pgsql-sql by date:

Previous
From: chester c young
Date:
Subject: Re: Extracting hostname from URI column
Next
From: Tom Lane
Date:
Subject: Re: select sum within transaction problem