Re: Multi-line text fields - Mailing list pgsql-sql

From Craig Ringer
Subject Re: Multi-line text fields
Date
Msg-id 48D8686A.50905@postnewspapers.com.au
Whole thread Raw
In response to Multi-line text fields  (Mike Toews <mwtoews@sfu.ca>)
Responses Resp.: Multi-line text fields  ("Osvaldo Kussama" <osvaldo.kussama@gmail.com>)
List pgsql-sql
Mike Toews wrote:
> Hi all,
> 
> I have some records that have some multiple lines within a single text 
> field. (On top of that, I think I have mixed DOS/UNIX line endings too). 
> I'm looking for two functions which can operate on a single field:
> 
>   1. number of lines

A few different ways, none of which are particularly pretty:

SELECT length(regexp_replace(inputstr, E'[^\\n]', '', 'g'));

SELECT count(1) from regexp_split_to_table(inputstr, E'\\n');

CREATE LANGUAGE plpythonu;
CREATE OR REPLACE FUNCTION nlines(s VARCHAR) RETURNS INTEGER AS $$
return len(s) - len(s.replace('\n',''))
$$ LANGUAGE 'plpythonu' IMMUTABLE;
SELECT nlines(inputstr);

Note that using \n should be OK so long as you don't have any files with 
Mac line endings. It'll be fine for both UNIX and DOS line endings.

If you really need it to be fast, consider writing a small C function to 
do the job. I'd consider an SQL-visible prototype like:

countchar(inputstr VARCHAR, findchar CHAR) RETURNS INTEGER

You might even want to send a patch in, just in case the dev team want 
to include it as a utility function.

>   2. a way to select the first line or to trim to the first line only
>      (the normal trim function doesn't appear to do this)

One way, again probably not the fastest:

SELECT (regexp_split_to_array(inputstr, E'\\n'))[1]

Note the extra set of parentheses. You might also want to trim() off any 
trailing \r in case of DOS line endings.

A little C function that copied the input only up to the first newline 
would instead probably be the fastest. It'd also let you easily strip 
the trailing \r if any was present.

--
Craig Ringer


pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Special grouping on sorted data.
Next
From: Craig Ringer
Date:
Subject: Re: Special grouping on sorted data.