Thread: string_to_array, array_to_string function without separator

string_to_array, array_to_string function without separator

From
Pavel Stehule
Date:
Hi

I propose variants of the mentioned functions without a specified separator. In the first case the string is transformed into an array of chars; in the second case, the array of chars is transformed back into a string.

Comments, notes?

Regards

Pavel

Re: string_to_array, array_to_string function without separator

From
David Fetter
Date:
On Fri, Mar 15, 2019 at 05:04:02AM +0100, Pavel Stehule wrote:
> Hi
> 
> I propose mentioned functions without specified separator. In this case the
> string is transformed to array of chars, in second case, the array of chars
> is transformed back to string.
> 
> Comments, notes?

Whatever optimizations you have in mind for this, could they also work
for string_to_array() and array_to_string() when they get an empty
string handed to them?

As to naming, some languages use explode/implode.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: string_to_array, array_to_string function without separator

From
Pavel Stehule
Date:


On Fri, Mar 15, 2019 at 15:03, David Fetter <david@fetter.org> wrote:
> On Fri, Mar 15, 2019 at 05:04:02AM +0100, Pavel Stehule wrote:
> > Hi
> >
> > I propose mentioned functions without specified separator. In this case the
> > string is transformed to array of chars, in second case, the array of chars
> > is transformed back to string.
> >
> > Comments, notes?
>
> Whatever optimizations you have in mind for this, could they also work
> for string_to_array() and array_to_string() when they get an empty
> string handed to them?

My idea is to use string_to_array('AHOJ') --> {A,H,O,J}

Empty input means an empty result --> {}

> As to naming, some languages use explode/implode.

That could work, but since we already have string_to_array, I think it is a good name.




Re: string_to_array, array_to_string function without separator

From
Chapman Flack
Date:
On 3/15/19 11:46 AM, Pavel Stehule wrote:
> On Fri, Mar 15, 2019 at 15:03, David Fetter <david@fetter.org> wrote:
>> Whatever optimizations you have in mind for this, could they also work
>> for string_to_array() and array_to_string() when they get an empty
>> string handed to them?
> 
> my idea is use string_to_array('AHOJ') --> {A,H,O,J}
> 
> empty input means empty result --> {}

I thought the question was maybe about an empty /delimiter/ string.

It seems that string_to_array already has this behavior if NULL is
passed as the delimiter:

> select string_to_array('AHOJ', null);
 string_to_array
-----------------
 {A,H,O,J}

and array_to_string has the proposed behavior if passed an
empty string as the delimiter (as one would naturally expect)
... but not null for a delimiter (that just makes the result null).

So the proposal seems roughly equivalent to making string_to_array's
second parameter optional default null, and array_to_string's second
parameter optional default ''.

Does that sound right?

Regards,
-Chap
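
[Editor's note] The equivalence described above can be modeled outside the database. The following is a minimal Python sketch, not PostgreSQL source code: the function names mirror the SQL functions, None stands in for SQL NULL, and the defaults (None for string_to_array, '' for array_to_string) are the hypothetical defaults under discussion, not existing behavior.

```python
from typing import List, Optional


def string_to_array(s: Optional[str],
                    delimiter: Optional[str] = None) -> Optional[List[str]]:
    """Model of string_to_array with the delimiter defaulting to NULL."""
    if s is None:
        return None          # NULL input gives NULL
    if s == "":
        return []            # empty input gives an empty array
    if delimiter is None:
        return list(s)       # NULL delimiter: one element per character
    if delimiter == "":
        return [s]           # empty delimiter: whole string is a single field
    return s.split(delimiter)


def array_to_string(arr: Optional[List[Optional[str]]],
                    delimiter: Optional[str] = "") -> Optional[str]:
    """Model of array_to_string with the delimiter defaulting to ''."""
    if arr is None or delimiter is None:
        return None          # NULL delimiter makes the result NULL
    # NULL elements are skipped, as when no null_string is supplied.
    return delimiter.join(x for x in arr if x is not None)


chars = string_to_array("AHOJ")        # ['A', 'H', 'O', 'J']
round_trip = array_to_string(chars)    # 'AHOJ'
```

With these defaults the two calls are inverses for any non-empty string, which is the round trip the proposal is after.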


Re: string_to_array, array_to_string function without separator

From
Pavel Stehule
Date:


On Fri, Mar 15, 2019 at 16:59, Chapman Flack <chap@anastigmatix.net> wrote:
> On 3/15/19 11:46 AM, Pavel Stehule wrote:
> > On Fri, Mar 15, 2019 at 15:03, David Fetter <david@fetter.org> wrote:
> >> Whatever optimizations you have in mind for this, could they also work
> >> for string_to_array() and array_to_string() when they get an empty
> >> string handed to them?
> >
> > my idea is use string_to_array('AHOJ') --> {A,H,O,J}
> >
> > empty input means empty result --> {}
>
> I thought the question was maybe about an empty /delimiter/ string.
>
> It seems that string_to_array already has this behavior if NULL is
> passed as the delimiter:
>
> > select string_to_array('AHOJ', null);
>  string_to_array
> -----------------
>  {A,H,O,J}
>
> and array_to_string has the proposed behavior if passed an
> empty string as the delimiter (as one would naturally expect)
> ... but not null for a delimiter (that just makes the result null).
>
> So the proposal seems roughly equivalent to making string_to_array's
> second parameter optional default null, and array_to_string's second
> parameter optional default ''.
>
> Does that sound right?

yes

Pavel



Re: string_to_array, array_to_string function without separator

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> So the proposal seems roughly equivalent to making string_to_array's
> second parameter optional default null, and array_to_string's second
> parameter optional default ''.

In that case why bother?  It'll just create a cross-version compatibility
hazard for next-to-no keystroke savings.  If the cases were so common
that they could be argued to be sane "default" behavior, I might feel
differently --- but if you were asked in a vacuum what the default
delimiters ought to be, I don't think you'd say "no delimiter".

            regards, tom lane


Re: string_to_array, array_to_string function without separator

From
Pavel Stehule
Date:


On Fri, Mar 15, 2019 at 17:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
> > So the proposal seems roughly equivalent to making string_to_array's
> > second parameter optional default null, and array_to_string's second
> > parameter optional default ''.
>
> In that case why bother?  It'll just create a cross-version compatibility
> hazard for next-to-no keystroke savings.  If the cases were so common
> that they could be argued to be sane "default" behavior, I might feel
> differently --- but if you were asked in a vacuum what the default
> delimiters ought to be, I don't think you'd say "no delimiter".

My motivation is the following: sometimes I need to convert a string to an array of chars. Using NULL as the separator is possible, but it is not intuitive. When you use the string_to_array function without a separator, only one semantic is possible: separation by chars.

I understand that there is a possible collision, in that a missing parameter could be read as a default value. But in this case that meaning is not practical.

Regards

Pavel



Re: string_to_array, array_to_string function without separator

From
Chapman Flack
Date:
On 3/15/19 12:15 PM, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> So the proposal seems roughly equivalent to making string_to_array's
>> second parameter optional default null, and array_to_string's second
>> parameter optional default ''.
> 
> In that case why bother?  It'll just create a cross-version compatibility
> hazard for next-to-no keystroke savings.  If the cases were so common
> that they could be argued to be sane "default" behavior, I might feel
> differently --- but if you were asked in a vacuum what the default
> delimiters ought to be, I don't think you'd say "no delimiter".

One could go further and argue that the non-optional arguments improve
clarity: a reader seeing the explicit NULL or '' argument gets a strong
clue what's intended, who in the optional-argument case might end up
thinking "must go look up what this function's default delimiter is".

-Chap


Re: string_to_array, array_to_string function without separator

From
Chapman Flack
Date:
On 3/15/19 12:26 PM, Pavel Stehule wrote:
> you use string_to_array function without separator, then only one possible
> semantic is there - separation by chars.

Other languages can and do specify other semantics for the
separator-omitted case: often (as in Python) it means to split
around "runs of one or more characters the platform considers white
space", as a convenience, given that it's a fairly commonly wanted
meaning but can be tedious to spell out as an explicit separator.

I admit I think a separator of '' would be more clear than null,
so if I were designing string_to_array in a green field, I think
I would swap the meanings of null and '' as the delimiter: null
would mean "don't really split anything", and '' would mean "split
everywhere you can find '' in the string", that is, everywhere.

But the current behavior is already established....

Regards,
-Chap
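
[Editor's note] For readers unfamiliar with the Python behavior mentioned above, here is a short illustration using only built-in string methods. The sample strings are made up for the example. It shows that omitting the separator in Python means "split on runs of whitespace", not "split into characters", which is why the separator-omitted case is not unambiguous across languages.

```python
text = "  split   this\tstring "

# No separator: split around runs of whitespace, dropping empty fields
# and ignoring leading/trailing whitespace.
whitespace_fields = text.split()
# → ['split', 'this', 'string']

# An explicit single-space separator behaves differently: consecutive
# spaces produce empty fields.
explicit_fields = "a b  c".split(" ")
# → ['a', 'b', '', 'c']

# Splitting into individual characters is spelled differently in Python.
per_char = list("AHOJ")
# → ['A', 'H', 'O', 'J']
```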


Re: string_to_array, array_to_string function without separator

From
Pavel Stehule
Date:


On Fri, Mar 15, 2019 at 17:54, Chapman Flack <chap@anastigmatix.net> wrote:
> On 3/15/19 12:26 PM, Pavel Stehule wrote:
> > you use string_to_array function without separator, then only one possible
> > semantic is there - separation by chars.
>
> Other languages can and do specify other semantics for the
> separator-omitted case: often (as in Python) it means to split
> around "runs of one or more characters the platform considers white
> space", as a convenience, given that it's a fairly commonly wanted
> meaning but can be tedious to spell out as an explicit separator.

For this proposal, "char" != byte:

result[n] = substring(str FROM n FOR 1)

> I admit I think a separator of '' would be more clear than null,
> so if I were designing string_to_array in a green field, I think
> I would swap the meanings of null and '' as the delimiter: null
> would mean "don't really split anything", and '' would mean "split
> everywhere you can find '' in the string", that is, everywhere.
>
> But the current behavior is already established....

yes

Pavel


Re: string_to_array, array_to_string function without separator

From
Chapman Flack
Date:
On 3/15/19 12:59 PM, Pavel Stehule wrote:
> for this proposal "char" != byte
> 
> result[n] = substring(str FROM n FOR 1)

I think that's what string_to_array(..., null) already does:

SHOW server_encoding;
server_encoding
UTF8

WITH
 t0(s) AS (SELECT text 'verlorn ist daz slüzzelîn'),
 t1(a) AS (SELECT string_to_array(s, null) FROM t0)
SELECT
 char_length(s), octet_length(convert_to(s, 'UTF8')),
 array_length(a,1), a
FROM
 t0, t1;

char_length|octet_length|array_length|a
25|27|25|{v,e,r,l,o,r,n," ",i,s,t," ",d,a,z," ",s,l,ü,z,z,e,l,î,n}


Regards,
-Chap
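
[Editor's note] The character-versus-byte distinction in the query above can be reproduced outside the database. This is a small Python sketch using the same sample string. The two non-ASCII letters are written as escapes on the assumption that they are the precomposed code points U+00FC and U+00EE.

```python
# Same text as in the SQL example; ü and î are written as escapes so the
# code point count does not depend on how this file is encoded.
s = "verlorn ist daz sl\u00fczzel\u00een"

chars = list(s)                        # one element per character
char_length = len(s)                   # counts code points
octet_length = len(s.encode("utf-8"))  # counts bytes in UTF-8

# ü and î each take two bytes in UTF-8, so the byte count exceeds the
# character count by two, while the array has one element per character.
```

The counts match the query result: 25 characters, 27 bytes, and 25 array elements.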


Re: string_to_array, array_to_string function without separator

From
Pavel Stehule
Date:


On Fri, Mar 15, 2019 at 18:30, Chapman Flack <chap@anastigmatix.net> wrote:
> On 3/15/19 12:59 PM, Pavel Stehule wrote:
> > for this proposal "char" != byte
> >
> > result[n] = substring(str FROM n FOR 1)
>
> I think that's what string_to_array(..., null) already does:

Sure. My proposal is more or less just dropping the null parameter.

> SHOW server_encoding;
> server_encoding
> UTF8
>
> WITH
>  t0(s) AS (SELECT text 'verlorn ist daz slüzzelîn'),
>  t1(a) AS (SELECT string_to_array(s, null) FROM t0)
> SELECT
>  char_length(s), octet_length(convert_to(s, 'UTF8')),
>  array_length(a,1), a
> FROM
>  t0, t1;
>
> char_length|octet_length|array_length|a
> 25|27|25|{v,e,r,l,o,r,n," ",i,s,t," ",d,a,z," ",s,l,ü,z,z,e,l,î,n}



Re: string_to_array, array_to_string function without separator

From
David Fetter
Date:
On Fri, Mar 15, 2019 at 12:31:21PM -0400, Chapman Flack wrote:
> On 3/15/19 12:15 PM, Tom Lane wrote:
> > Chapman Flack <chap@anastigmatix.net> writes:
> >> So the proposal seems roughly equivalent to making string_to_array's
> >> second parameter optional default null, and array_to_string's second
> >> parameter optional default ''.
> > 
> > In that case why bother?  It'll just create a cross-version compatibility
> > hazard for next-to-no keystroke savings.  If the cases were so common
> > that they could be argued to be sane "default" behavior, I might feel
> > differently --- but if you were asked in a vacuum what the default
> > delimiters ought to be, I don't think you'd say "no delimiter".
> 
> One could go further and argue that the non-optional arguments improve
> clarity: a reader seeing the explicit NULL or '' argument gets a strong
> clue what's intended, who in the optional-argument case might end up
> thinking "must go look up what this function's default delimiter is".

Going to look up the function's behavior would be much more fun if
there were comments on these functions explaining things.  I'll draft
up a patch for some of that.

In a similar vein, I haven't been able to come up with hazards of
naming function parameters in some document-ish way. What did I miss?

Best,
David.