Thread: String manipulation

String manipulation

From
"Chad Thompson"
Date:
I cannot seem to put this together properly.

I have a column path that has string data I want parsed out.

select path from paths

path
-------------------------------------
Canon:Canon Toner
Data Products:Data Products Toner
Epson:Epson Inkjet Cartridges
Tektronix by Xerox:Tektronix Ink Sticks


what i would like is something like this

Toner
Products Toner
Inkjet Cartridges
Ink Sticks

That is.... after the colon, find the first space and give the rest of the
string.

TIA for any ideas
Chad


Re: String manipulation

From
"David Olbersen"
Date:
Chad,

I have an *UGLY* solution that might give you or others some ideas...

SELECT
  substring(
    split_part( path, ':', 2 )
    from
    position( ' ' in split_part( path, ':', 2 ) ) + 1
  )
FROM paths;

--------------------------
David Olbersen
iGuard Engineer
St. Bernard Software
11415 West Bernardo Court
San Diego, CA 92127
1-858-676-2277 x2152


> -----Original Message-----
> From: Chad Thompson [mailto:chad@weblinkservices.com]
> Sent: Monday, July 28, 2003 1:56 PM
> To: pgsql-novice
> Subject: [NOVICE] String manipulation
>
>
> I cannot seem to put this together properly.
>
> I have a column path that has string data I want parsed out.
>
> select path from paths
>
> path
> -------------------------------------
> Canon:Canon Toner
> Data Products:Data Products Toner
> Epson:Epson Inkjet Cartridges
> Tektronix by Xerox:Tektronix Ink Sticks
>
>
> what i would like is something like this
>
> Toner
> Products Toner
> Inkjet Cartridges
> Ink Sticks
>
> That is.... after the colon, find the first space and give
> the rest of the
> string.
>
> TIA for any ideas
> Chad
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>

Re: String manipulation

From
Joe Conway
Date:
David Olbersen wrote:
> I have an *UGLY* solution that might give you or others some ideas...
>
> SELECT
>   substring(
>     split_part( path, ':', 2 )
>     from
>     position( ' ' in split_part( path, ':', 2 ) ) + 1
>   )
> FROM paths;
>

That solution was pretty much what I was thinking. An alternative:

SELECT substring(path from position(' ' in path) + 1)
FROM (SELECT split_part( path, ':', 2 ) AS path FROM paths) AS ss;

Joe


Re: String manipulation

From
"David Olbersen"
Date:
Joe,

> SELECT substring(path from position(' ' in path) + 1)
> FROM (SELECT split_part( path, ':', 2 ) AS path FROM paths) AS ss;

That's cool enough except for the sub-select which would really slow down the query on a million+ row table. My
solutionis just reformatting data and doesn't add much if any over head to "SELECT path FROM paths". 

...but that's all speculation, am I right? I'd like to revise my thinking if I'm wrong.

--------------------------
David Olbersen
iGuard Engineer
St. Bernard Software
11415 West Bernardo Court
San Diego, CA 92127
1-858-676-2277 x2152


> -----Original Message-----
> From: Joe Conway [mailto:mail@joeconway.com]
> Sent: Monday, July 28, 2003 3:55 PM
> To: David Olbersen
> Cc: pgsql-novice
> Subject: Re: [NOVICE] String manipulation
>
>
> David Olbersen wrote:
> > I have an *UGLY* solution that might give you or others
> some ideas...
> >
> > SELECT
> >   substring(
> >     split_part( path, ':', 2 )
> >     from
> >     position( ' ' in split_part( path, ':', 2 ) ) + 1
> >   )
> > FROM paths;
> >
>
> That solution was pretty much what I was thinking. An alternative:
>
> SELECT substring(path from position(' ' in path) + 1)
> FROM (SELECT split_part( path, ':', 2 ) AS path FROM paths) AS ss;
>
> Joe
>
>

Re: String manipulation

From
Joe Conway
Date:
David Olbersen wrote:
>> SELECT substring(path from position(' ' in path) + 1) FROM (SELECT
>> split_part( path, ':', 2 ) AS path FROM paths) AS ss;
>
> That's cool enough except for the sub-select which would really slow
> down the query on a million+ row table. My solution is just
> reformatting data and doesn't add much if any over head to "SELECT
> path FROM paths".
>
> ...but that's all speculation, am I right? I'd like to revise my
> thinking if I'm wrong.
>

I could be wrong but I was thinking it would be more efficient to only
calculate the function once. I get these explain analyze results:

regression=# explain analyze SELECT substring(path from position(' ' in
path) + 1) FROM (SELECT split_part( path, ':', 2 ) AS path FROM paths)
AS ss;
QUERY PLAN
-----------------------------------------------------------------------
  Seq Scan on paths  (cost=0.00..1.09 rows=4 width=33) (actual
time=0.06..0.09 rows=4 loops=1)
  Total runtime: 0.19 msec
(2 rows)

regression=# explain analyze SELECT substring(split_part( path, ':', 2 )
from position( ' ' in split_part( path, ':', 2 ) ) + 1) FROM paths;
QUERY PLAN
-------------------------------------------------------------------------
  Seq Scan on paths  (cost=0.00..1.09 rows=4 width=33) (actual
time=0.09..0.13 rows=4 loops=1)
  Total runtime: 0.22 msec
(2 rows)

In either case each row is hit once.

Joe


Re: String manipulation

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> SELECT substring(path from position(' ' in path) + 1) FROM (SELECT
> split_part( path, ':', 2 ) AS path FROM paths) AS ss;

> I could be wrong but I was thinking it would be more efficient to only
> calculate the function once.

Actually, the only thing you saved was writing out the split_part call
twice --- when the planner flattens the sub-select it expands each
reference to a sub-select output, so the split_part ends up getting
evaluated twice anyway.  But the notational savings can be useful.

            regards, tom lane