Re: PostgreSQL SQL Tricks: faster urldecode - Mailing list pgsql-general

From: Marc Mamin
Subject: Re: PostgreSQL SQL Tricks: faster urldecode
Msg-id: B6F6FD62F2624C4C9916AC0175D56D880CE1C178@jenmbs01.ad.intershop.net
In response to: Re: PostgreSQL SQL Tricks: faster urldecode (Merlin Moncure <mmoncure@gmail.com>)
List: pgsql-general

> From: Merlin Moncure [mmoncure@gmail.com]
> Sent: Friday, 20 September 2013 17:43
>
> On Fri, Sep 20, 2013 at 10:26 AM, Marc Mamin <M.Mamin@intershop.de> wrote:
> > Hi,
> > here is a function which is about 8 x faster than the one described in the PostgreSQL SQL Tricks
> > ( http://postgres.cz/wiki/PostgreSQL_SQL_Tricks#Function_for_decoding_of_url_code )
> >
> > The idea is to handle each encoded or unencoded part in bulk rather than splitting on each character.
> >
> > urldecode_arr:
> > Seq Scan on lt_referrer  (actual time=1.966..17623.979 rows=65717 loops=1)
> >
> > urldecode:
> > Seq Scan on lt_referrer  (actual time=4.846..144445.292 rows=65717 loops=1)
>
> very nice.  Basically it comes down to this: all non-trivial regex
> replacements require decomposing the string into an array, because
> regexp_replace() cannot apply a function to the text it matches; the
> replacement can only refer back to the matched substrings.  This is a
> crippling limitation relative to first-class regex languages like
> Perl: PostgreSQL's string translation functions are invisible to the
> regex engine.  I have no idea if this is fixable (I dimly recall Tom
> explaining why it might not be).
>
> merlin

Yes, one possible(?) aid for such problems would be a new variant of regexp_split_to_table
that returns two columns:
- the split parts (as it does now)
- the separator matches (new)

Marc
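
To make the proposal concrete, here is a minimal sketch of the semantics such a two-column split could have, written in Python rather than PostgreSQL because no such variant of regexp_split_to_table exists. The names split_with_separators and urldecode are hypothetical, and the sketch assumes UTF-8 input and does not treat '+' as a space. It is not the function from the original post; it only illustrates the same "decode each encoded run in bulk" idea.

```python
import re

# Matches a maximal run of %XX escapes, e.g. "%C3%A9".
# The capturing group makes re.split() return the separators as well.
ENCODED_RUN = re.compile(r"((?:%[0-9A-Fa-f]{2})+)")


def split_with_separators(text):
    """Return (part, separator) pairs; the last separator is None.

    Hypothetical stand-in for a two-column regexp_split_to_table.
    """
    pieces = ENCODED_RUN.split(text)
    # pieces alternates: part, separator, part, ..., part
    parts = pieces[0::2]
    seps = pieces[1::2] + [None]
    return list(zip(parts, seps))


def urldecode(text):
    """Decode percent-escapes one run at a time (assumes UTF-8)."""
    out = []
    for part, sep in split_with_separators(text):
        out.append(part)  # unencoded part is copied unchanged
        if sep is not None:
            # Strip the '%' signs and decode the whole run in one step.
            raw = bytes.fromhex(sep.replace("%", ""))
            out.append(raw.decode("utf-8", errors="replace"))
    return "".join(out)
```

With both columns available, each row carries an unencoded part plus the encoded run that follows it, so the decoding needs a single pass and no per-character split.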



