Re: patch adding new regexp functions - Mailing list pgsql-patches
From | Jeremy Drake |
---|---|
Subject | Re: patch adding new regexp functions |
Date | |
Msg-id | Pine.BSO.4.64.0702170005560.18849@resin.csoft.net |
In response to | Re: patch adding new regexp functions (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: patch adding new regexp functions
Re: patch adding new regexp functions |
List | pgsql-patches |
On Sat, 17 Feb 2007, Peter Eisentraut wrote:

> Jeremy Drake wrote:
> > In case you haven't noticed, I am rather averse to making this return
> > text[] because it is much easier in my experience to use the results
> > when returned in SETOF rather than text[],
>
> The primary use case I know for string splitting is parsing
> comma/pipe/whatever separated fields into a row structure, and the way
> I see it your API proposal makes that exceptionally difficult.

For this case see string_to_array:
http://developer.postgresql.org/pgdocs/postgres/functions-array.html

select string_to_array('a|b|c', '|');
 string_to_array
-----------------
 {a,b,c}
(1 row)

> I don't know what your use case is, though.  All of this is missing
> actual use cases.

The particular use case I had for this function was at a previous
employer, and I am not sure exactly how much detail is appropriate to
divulge.  Basically, the project was doing some text processing inside of
postgres, and getting all of the words from a string into a table with
some processing (excluding stopwords and so forth) as efficiently as
possible was a big concern.

The regexp_split function code was based on some code that a friend of
mine wrote which used PCRE rather than postgres' internal regexp support.
I don't know exactly what his use-case was, but he probably had one
because he wrote the function and had it returning SETOF text ;)
Perhaps he can share a general idea of what it was (nudge nudge)?

> > While, if you
> > really really wanted a text[], you could use the (fully documented)
> > ARRAY(select resultstr from regexp_split(...) order by startpos)
> > construct.
>
> I think, however, that we should be providing simple primitives that can
> be combined into complex expressions rather than complex primitives
> that have to be dissected apart to get simple results.

The most simple primitive is string_to_array(text, text) returns text[],
but it was not sufficient for our needs.
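To make the row-oriented use case concrete, here is a rough sketch of the
idea in Python rather than SQL, so that it can be run without the patch
applied.  It is only an illustration, not the patch's implementation: the
helper names, the stopword list, and the 0-based startpos are all
assumptions made for the example.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of"}

def regexp_split(text, pattern):
    """Yield (startpos, resultstr) for each piece of text between matches.

    Mimics the row-per-piece shape discussed in the thread; startpos is
    0-based here, which is an assumption of this sketch.
    """
    pos = 0
    for m in re.finditer(pattern, text):
        yield pos, text[pos:m.start()]
        pos = m.end()
    yield pos, text[pos:]

def words(text):
    """Split on runs of non-word characters, dropping empties and stopwords."""
    return [(p, w) for p, w in regexp_split(text, r"\W+")
            if w and w.lower() not in STOPWORDS]

# Rows can be filtered one at a time, as with a SETOF result ...
rows = words("The law of the land")

# ... and collapsed into an ordered array afterwards, analogous to
# ARRAY(select resultstr from regexp_split(...) order by startpos).
as_array = [w for _, w in sorted(rows)]
```

The point of the sketch is that filtering happens per row before anything
is collected, while the array form is still one step away for anyone who
wants it.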
> > > As for the regexp_matches() function, it seems to me that it
> > > returns too much information at once.  What is the use case for
> > > getting all of prematch, fullmatch, matches, and postmatch in one
> > > call?
> >
> > It was requested by David Fetter:
> > http://archives.postgresql.org/pgsql-hackers/2007-02/msg00056.php
> >
> > It was not horribly difficult to provide, and it seemed reasonable to
> > me.  I have no need for them personally.
>
> David Fetter has also repeated failed to offer a use case for this, so I
> hesitate to accept this.

I have no strong opinion either way, so I will let those who do argue it
out and wait for the dust to settle ;)

--
The Law, in its majestic equality, forbids the rich, as well as the poor,
to sleep under the bridges, to beg in the streets, and to steal bread.
		-- Anatole France