Thread: patch: to_string, to_array functions
Hello,

The attached patch contains to_string and to_array functions. These functions are equivalents of the array_to_string and string_to_array functions, with arguably more correct NULL handling.

postgres=# select to_array('1,2,3,4,,6',',');
     to_array
------------------
 {1,2,3,4,NULL,6}
(1 row)

postgres=# select to_array('1,2,3,4,,6',',','***');
    to_array
----------------
 {1,2,3,4,"",6}
(1 row)

postgres=# select to_string(array[1,2,3,4,NULL,6],',');
 to_string
------------
 1,2,3,4,,6
(1 row)

postgres=# select to_string(array[1,2,3,4,NULL,6],',','***');
   to_string
---------------
 1,2,3,4,***,6
(1 row)

Regards
Pavel Stehule
Attachment
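The four psql examples in the opening message can be modelled outside the database. The following is a minimal Python sketch, not the patch's C code: it assumes SQL NULL maps to Python `None`, that array elements are plain strings, and that the default null string is the empty string (as the later review states). The function names mirror the proposal; everything else is illustrative.

```python
from typing import List, Optional


def to_array(text: str, sep: str, null_string: str = "") -> List[Optional[str]]:
    """Split text on sep; a field equal to null_string becomes None (SQL NULL)."""
    return [None if field == null_string else field for field in text.split(sep)]


def to_string(items: List[Optional[object]], sep: str, null_string: str = "") -> str:
    """Join items with sep, writing null_string in place of each None."""
    return sep.join(null_string if item is None else str(item) for item in items)


# The four cases from the message above.
print(to_array("1,2,3,4,,6", ","))                    # empty field -> None
print(to_array("1,2,3,4,,6", ",", "***"))             # empty field kept as ''
print(to_string([1, 2, 3, 4, None, 6], ","))          # 1,2,3,4,,6
print(to_string([1, 2, 3, 4, None, 6], ",", "***"))   # 1,2,3,4,***,6
```

With these defaults the two functions round-trip: a NULL element is written as an empty field and read back as NULL, which is the serialisation property the patch is after.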
https://commitfest.postgresql.org/action/patch_view?id=300

Why did you add to_string() and to_array() functions when we already
have string_to_array() and array_to_string() functions? I would prefer adding
a three-argument version of string_to_array() instead of to_array().
Please let me know if you think to_string() and to_array() are better names
for the feature, for example for compatibility with other databases.

* string_to_array( str text, sep text, nullstr text DEFAULT NULL )
  is compatible with the existing string_to_array( str, sep ), and
  "nullstr => 'NULL'" will be the same as your to_array().

* array_to_string( arr anyarray, sep text, nullstr text DEFAULT NULL )
  is compatible with the existing array_to_string(); the separator is also
  ignored when nullstr is NULL. "nullstr => ''" (an empty string) will be
  the same as your to_string().

--
Itagaki Takahiro
2010/7/12 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> https://commitfest.postgresql.org/action/patch_view?id=300
>
> Why did you add to_string() and to_array() functions though we already
> have string_to_array() and array_to_string() functions? I prefer adding
> three arguments version of string_to_array() instead of to_array().
> Please notice me if you think to_string() and to_array() are better names
> for the feature. For example, compatibility for other databases.
>

I prefer new names because there is new behavior, with somewhat better default handling of NULL values. string_to_array and array_to_string just ignore NULL values, which isn't correct behavior. Later we can mark these functions as deprecated and remove them. If I use the current functions, then we have to keep the current behavior.

> * string_to_array( str text, sep text, nullstr text DEFAULT NULL )
> is compatible with the existing string_to_array( str, sep ), and
> "nullstr => 'NULL'" will be same as your to_array().
>
> * array_to_string( arr anyarray, sep text, nullstr text DEFAULT NULL )
> is compatible with the existing array_to_string(); separator also ignored
> when nullstr is NULL. "nullstr => ''" (an empty string) will be same as
> your to_array().
>

So the reason for the new names is the different default behavior. And we can't change the default behavior of the existing functions.

Regards
Pavel Stehule

> --
> Itagaki Takahiro
>
2010/7/12 Pavel Stehule <pavel.stehule@gmail.com>:
> I prefere a new names - because there are a new behave - with little
> bit better default handling of NULL values. string_to_array and
> array_to_string just ignore NULL values - what isn't correct behave.
> Later we can mark these functions as deprecated and remove it. If I
> use current function, then we have to continue in current behave.

I prefer the existing names because your new default behavior can be achieved
with suitable nullstr values. IMHO, new names will be acceptable only if
they are listed in the SQL standard or if many other databases use the
names. Two similar versions of the functions would surely confuse users.

Also, is there any consensus that "existing functions are not correct"?
Since string_agg() and your new concat() function ignore NULLs,
I think it is not so bad for array_to_string() to ignore NULLs.

--
Itagaki Takahiro
A note:

2010/7/12 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/7/12 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>> https://commitfest.postgresql.org/action/patch_view?id=300
>>
>> Why did you add to_string() and to_array() functions though we already
>> have string_to_array() and array_to_string() functions? I prefer adding
>> three arguments version of string_to_array() instead of to_array().
>> Please notice me if you think to_string() and to_array() are better names
>> for the feature. For example, compatibility for other databases.
>>
>
> I prefere a new names - because there are a new behave - with little
> bit better default handling of NULL values. string_to_array and
> array_to_string just ignore NULL values - what isn't correct behave.

This dates from the time when pg arrays didn't support NULLs. Since 8.3 a pg array can contain NULL values, but there were no corresponding changes to the string_to_array and array_to_string functions, so these functions are not up to date.

pavel

> Later we can mark these functions as deprecated and remove it. If I
> use current function, then we have to continue in current behave.
>
>> * string_to_array( str text, sep text, nullstr text DEFAULT NULL )
>> is compatible with the existing string_to_array( str, sep ), and
>> "nullstr => 'NULL'" will be same as your to_array().
>>
>> * array_to_string( arr anyarray, sep text, nullstr text DEFAULT NULL )
>> is compatible with the existing array_to_string(); separator also ignored
>> when nullstr is NULL. "nullstr => ''" (an empty string) will be same as
>> your to_array().
>>
>
> so reason for these new names are different default behave. And we
> can't to change of default behave of existing functions.
>
> Regards
>
> Pavel Stehule
>
>
>
>> --
>> Itagaki Takahiro
>>
>
2010/7/12 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> 2010/7/12 Pavel Stehule <pavel.stehule@gmail.com>:
>> I prefere a new names - because there are a new behave - with little
>> bit better default handling of NULL values. string_to_array and
>> array_to_string just ignore NULL values - what isn't correct behave.
>> Later we can mark these functions as deprecated and remove it. If I
>> use current function, then we have to continue in current behave.
>
> I prefer existing names because your new default behavior can be done
> with suitable nullstr values. IMHO, new names will be acceptable only if
> they are listed in the SQL-standard or many other databases use the
> names. Two similar versions of functions must confuse users.

There is a different default behavior, so you don't need to use a third argument.

>
> Also, are there any consensus about "existing functions are not correct" ?
> Since string_agg() and your new concat() functions ignores NULLs,
> I think it is not so bad for array_to_string() to ignore NULLs.

string_agg is an aggregate function, and NULLs are usually ignored there. concat simulates the MySQL behavior, and moreover there is no problem using the coalesce function with it. string_to_array and array_to_string are different: there you cannot use coalesce.

Why are string_to_array and array_to_string not correct?

a) What is a correct example of using array_to_string with NULLs ignored? Usually, when you have a NULL in an array, you don't want to lose this value.

b) For me, these functions are serialisation/deserialisation functions, and usually people don't want to miss any value.

I searched the history; my first proposal was similar to yours:

http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151474.html
http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151503.html !!

If you look at this thread, you can see that I was unsure and confused too, but now I am inclined toward Merlin's proposal. In short:

* string_to_array/array_to_string ignore NULLs
* other non-aggregate functions do not ignore NULLs
* the default for NULL isn't "NULL" but an empty string, like CSV

regards
Pavel Stěhule

>
> --
> Itagaki Takahiro
>
On 6 May 2010 04:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> attached patch contains to_string and to_array functions. These
> functions are equivalent of array_to_string and string_to_array
> function with maybe more correct NULL handling.

Hi Pavel,

I am reviewing your patch for the commitfest.

Overall the patch looks good, although there were some bogus whitespace changes in the patch and some messy punctuation/grammar in some of the code comments. I also thought it was worth mentioning in the docs the default value for null_string is ''. I made an attempt to clean those items up and have attached a v2 of the patch.

Regarding the behaviour of the third argument (null_string), I was a little surprised by the results when I passed in a NULL.

postgres=# select to_string(array['a', 'b', 'c', 'd'], '/', NULL);
 to_string
-----------

Now, if the array had some NULL elements in it, I could understand why the resulting string would be NULL (because str || NULL is NULL), but in this case there are no NULLs. Why is the result NULL? Surely it should be 'a/b/c/d' regardless of how the third parameter is set?

In the reverse case:

postgres=# select to_array('a/b/c/d', '/', NULL);
 to_array
----------

(1 row)

Again I find this a bit weird. I have left the null_string NULL, which means it is unknown. It can't possibly match any value in the string, so effectively passing in a NULL null_string should mean that the user doesn't want any string items whatsoever to translate into NULLs in the resulting array. I would expect this call to return {a,b,c,d}.

Cheers,
BJ
Attachment
Hello

2010/7/16 Brendan Jurd <direvus@gmail.com>:
> On 6 May 2010 04:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> attached patch contains to_string and to_array functions. These
>> functions are equivalent of array_to_string and string_to_array
>> function with maybe more correct NULL handling.
>
> Hi Pavel,
>
> I am reviewing your patch for the commitfest.
>
> Overall the patch looks good, although there were some bogus
> whitespace changes in the patch and some messy punctuation/grammar in
> some of the code comments. I also thought it was worth mentioning in
> the docs the default value for null_string is ''. I made an attempt
> to clean those items up and have attached a v2 of the patch.
>
> Regarding the behaviour of the third argument (null_string), I was a
> little surprised by the results when I passed in a NULL.
>
> postgres=# select to_string(array['a', 'b', 'c', 'd'], '/', NULL);
>  to_string
> -----------
>
> Now, if the array had some NULL elements in it, I could understand why
> the resulting string would be NULL (because str || NULL is NULL), but
> in this case there are no NULLs. Why is the result NULL? Surely it
> should be 'a/b/c/d' regardless of how the third parameter is set?
>
> In the reverse case:
>
> postgres=# select to_array('a/b/c/d', '/', NULL);
>  to_array
> ----------
>
> (1 row)
>

I hadn't thought about NULL as a separator before. The current behavior isn't practical. When the default separator is an empty string, then NULL can be used to mean "ignore NULLs", so it can emulate the current string_to_array and array_to_string behavior. This is possible because NULL can never be a separator.

select to_string(array[1,2,3,null,5], ',') -> 1,2,3,,5
select to_string(array[1,2,3,null,5], ',', null) -> 1,2,3,5

Another idea, and maybe a better one: we can check the separator for NOT NULL and add another parameter, ignore_null, with default = false.

select to_string(array[1,2,3,null,5], ',', ignore_null := true) -> 1,2,3,5

What do you think?

Regards
Pavel

> Again I find this a bit weird. I have left the null_string NULL,
> which means it is unknown. It can't possibly match any value in the
> string, so effectively passing in a NULL null_string should mean that
> the user doesn't want any string items whatsoever to translate into
> NULLs in the resulting array. I would expect this call to return
> {a,b,c,d}.
>
> Cheers,
> BJ
>
On 17 July 2010 02:15, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/7/16 Brendan Jurd <direvus@gmail.com>:
>> Regarding the behaviour of the third argument (null_string), I was a
>> little surprised by the results when I passed in a NULL.
>>
>
> I didn't thinking about NULL as separator before. Current behave isn't
> practical. When default separator is empty string, then NULL can be
> used as ignore NULLs - so it can emulate current string_to_array and
> array_to_string behave. It can be, because NULL can't be a separator
> ever.
>
> select to_string(array[1,2,3,null,5], ',') -> 1,2,3,,5
> select to_string(array[1,2,3,null,5], ',', null) -> 1,2,3,5
>
> maybe - next idea and maybe better - we can check NOT NULL for
> separator and to add other parameter with default = false -
> ignore_null
>
> select to_string(array[1,2,3,null,5], ',', ignore_null := true) -> 1,2,3,5
>
> what do you think?

I don't have any problem with null_string = NULL in to_string taking the meaning "skip over NULL elements". It's a slightly strange outcome but it's more useful than returning NULL, and I do like that it gives us a path to the current array_to_string() treatment even if those functions are ultimately deprecated. I think adding a fourth keyword argument might be sacrificing a little too much convenience in the calling convention.

As for to_array, null_string = NULL should mean that there is no string which should result in a NULL element. So I would be happy to see the following set of behaviours:

to_string(array[1, 2, 3, 4, 5], ',', null) = '1,2,3,4,5'
to_string(array[1, 2, 3, null, 5], ',', null) = '1,2,3,5'
to_array('1,2,3,,5', ',', null) = '{1,2,3,"",5}'

Also, if we're going to make the function non-strict, we need to consider how to respond when the user specifies NULL for the other arguments. If the field separator is NULL, bearing in mind that NULL can't match any string, I would expect that to_array would return the undivided string as a single array element, and that to_string would throw an error:

to_array('1,2,3,4,5', null) = '{"1,2,3,4,5"}'
to_string(array[1,2,3,4,5], null) = ERROR: the field separator for to_string may not be NULL

If the first argument is NULL for either function, I think it would be reasonable to return NULL. But I could be convinced that we should throw an error in that case too.

Cheers,
BJ
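The NULL-argument rules proposed in the review above can be written down as a small executable model. This is a hedged Python sketch of those rules only, not the patch: SQL NULL is represented by `None`, array elements by strings, and the error is a Python `ValueError` standing in for the SQL error. The empty-string separator case discussed later in the thread is deliberately not modelled.

```python
from typing import List, Optional


def to_string(items: Optional[list], sep: Optional[str],
              null_string: Optional[str] = "") -> Optional[str]:
    """Join items; None elements become null_string, or are skipped if it is None."""
    if items is None:
        return None  # NULL first argument -> NULL result
    if sep is None:
        raise ValueError("the field separator for to_string may not be NULL")
    parts = []
    for item in items:
        if item is None:
            if null_string is None:
                continue  # null_string = NULL means "skip over NULL elements"
            parts.append(null_string)
        else:
            parts.append(str(item))
    return sep.join(parts)


def to_array(text: Optional[str], sep: Optional[str],
             null_string: Optional[str] = "") -> Optional[List[Optional[str]]]:
    """Split text; fields equal to null_string become None, unless null_string is None."""
    if text is None:
        return None  # NULL first argument -> NULL result
    if sep is None:
        return [text]  # NULL separator matches nothing: one undivided element
    fields = text.split(sep)  # assumes a non-empty separator
    if null_string is None:
        return list(fields)  # no field is ever translated to NULL
    return [None if field == null_string else field for field in fields]


print(to_string([1, 2, 3, None, 5], ",", None))  # 1,2,3,5
print(to_array("1,2,3,,5", ",", None))           # ['1', '2', '3', '', '5']
print(to_array("1,2,3,4,5", None))               # ['1,2,3,4,5']
```

Passing `None` as the third argument to `to_string` reproduces the existing array_to_string() treatment of NULL elements, which is the migration path the review mentions.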
2010/7/16 Brendan Jurd <direvus@gmail.com>:
> On 17 July 2010 02:15, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2010/7/16 Brendan Jurd <direvus@gmail.com>:
>>> Regarding the behaviour of the third argument (null_string), I was a
>>> little surprised by the results when I passed in a NULL.
>>>
>>
>> I didn't thinking about NULL as separator before. Current behave isn't
>> practical. When default separator is empty string, then NULL can be
>> used as ignore NULLs - so it can emulate current string_to_array and
>> array_to_string behave. It can be, because NULL can't be a separator
>> ever.
>>
>> select to_string(array[1,2,3,null,5], ',') -> 1,2,3,,5
>> select to_string(array[1,2,3,null,5], ',', null) -> 1,2,3,5
>>
>> maybe - next idea and maybe better - we can check NOT NULL for
>> separator and to add other parameter with default = false -
>> ignore_null
>>
>> select to_string(array[1,2,3,null,5], ',', ignore_null := true) -> 1,2,3,5
>>
>> what do you think?
>
> I don't have any problem with null_string = NULL in to_string taking
> the meaning "skip over NULL elements". It's a slightly strange
> outcome but it's more useful than returning NULL, and I do like that
> it gives us a path to the current array_to_string() treatment even if
> those functions are ultimately deprecated. I think adding a fourth
> keyword argument might be sacrificing a little too much convenience in
> the calling convention.
>
> As for to_array, null_string = NULL should mean that there is no
> string which should result in a NULL element. So I would be happy to
> see the following set of behaviours:
>
> to_string(array[1, 2, 3, 4, 5], ',', null) = '1,2,3,4,5'
> to_string(array[1, 2, 3, null, 5], ',', null) = '1,2,3,5'
> to_array('1,2,3,,5', ',', null) = '{1,2,3,"",5}'
>
> Also, if we're going to make the function non-strict, we need to
> consider how to respond when the user specifies NULL for the other
> arguments. If the field separator is NULL, bearing in mind that NULL
> can't match any string, I would expect that to_array would return the
> undivided string as a single array element, and that to_string would
> throw an error:
>

OK, that makes sense.

Another question is an empty string as the separator, but I think it can have the same behavior as in the string_to_array and array_to_string functions.

> to_array('1,2,3,4,5', null) = '{"1,2,3,4,5"}'
> to_string(array[1,2,3,4,5], null) = ERROR: the field separator for
> to_string may not be NULL
>
> If the first argument is NULL for either function, I think it would be
> reasonable to return NULL. But I could be convinced that we should
> throw an error in that case too.
>

I agree; I prefer a NULL.

Thank you very much
Pavel

> Cheers,
> BJ
>
On 17 July 2010 04:52, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/7/16 Brendan Jurd <direvus@gmail.com>:
>> Also, if we're going to make the function non-strict, we need to
>> consider how to respond when the user specifies NULL for the other
>> arguments. If the field separator is NULL, bearing in mind that NULL
>> can't match any string, I would expect that to_array would return the
>> undivided string as a single array element, and that to_string would
>> throw an error:
>>
>
> ok, it has a sense.
>
> other question is empty string as separator - but I think, it can has
> same behave like string_to_array and array_to_string functions.
>

Agreed. Those behaviours seem sensible.

>> If the first argument is NULL for either function, I think it would be
>> reasonable to return NULL. But I could be convinced that we should
>> throw an error in that case too.
>>
>
> I agree - I prefer a NULL
>
> Thank You very much

No worries; I will await a revised patch from you which updates these behaviours -- please incorporate the doc/comment changes I posted earlier -- I will then do a further review before handing off to a committer.

Cheers,
BJ
Hello

Here is a new version. Now these functions are not strict, and the to_string function is marked as stable. Both functions share code with the older version.

Regards
Pavel

2010/7/16 Brendan Jurd <direvus@gmail.com>:
> On 17 July 2010 04:52, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2010/7/16 Brendan Jurd <direvus@gmail.com>:
>>> Also, if we're going to make the function non-strict, we need to
>>> consider how to respond when the user specifies NULL for the other
>>> arguments. If the field separator is NULL, bearing in mind that NULL
>>> can't match any string, I would expect that to_array would return the
>>> undivided string as a single array element, and that to_string would
>>> throw an error:
>>>
>>
>> ok, it has a sense.
>>
>> other question is empty string as separator - but I think, it can has
>> same behave like string_to_array and array_to_string functions.
>>
>
> Agreed. Those behaviours seem sensible.
>
>>> If the first argument is NULL for either function, I think it would be
>>> reasonable to return NULL. But I could be convinced that we should
>>> throw an error in that case too.
>>>
>>
>> I agree - I prefer a NULL
>>
>> Thank You very much
>
> No worries; I will await a revised patch from you which updates these
> behaviours -- please incorporate the doc/comment changes I posted
> earlier -- I will then do a further review before handing off to a
> committer.
>
> Cheers,
> BJ
>
Attachment
2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
> here is a new version - new these functions are not a strict and
> function to_string is marked as stable.

We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is a good idea to have different names for them? IMHO, we had better use a three-argument version of array_to_string() instead of the new to_string().

If to_string and to_array are in the SQL standard, we can accept the name changes. But if there is no standard, I'd like to keep the existing function names.

--
Itagaki Takahiro
2010/7/21 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
>> here is a new version - new these functions are not a strict and
>> function to_string is marked as stable.
>
> We have array_to_string(anyarray, text) and string_to_array(text, text),
> and you'll introduce to_string(anyarray, text, text) and
> to_array(text, text, text).

I have to repeat it: the behavior of these functions is a little bit different. string_to_array and array_to_string are buggy.

* they don't support NULL
* they don't differentiate between an empty array and NULL
* we cannot change the default behavior of the existing functions
* array_to_string is wrongly marked as IMMUTABLE

> Do we think it is good idea to have different names for them? IMHO, we'd
> better use 3 arguments version of array_to_string() instead of the
> new to_string() ?
>
> If to_string and to_array is in the SQL standard, we can accept the
> name changes.
> But if there are no standard, I'd like to keep the existing function names.
>

No, it isn't in the standard, but I think we have to gently leave the old functions alone.

Regards
Pavel Stehule

> --
> Itagaki Takahiro
>
On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote:
> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
>> here is a new version - new these functions are not a strict and
>> function to_string is marked as stable.
>
> We have array_to_string(anyarray, text) and string_to_array(text, text),
> and you'll introduce to_string(anyarray, text, text) and
> to_array(text, text, text).
> Do we think it is good idea to have different names for them? IMHO, we'd
> better use 3 arguments version of array_to_string() instead of the
> new to_string() ?

The worst part is that the new names are not very mnemonic.

I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry in the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as string_to_array(), and then replaces empty strings with NULL or some other value.

Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
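The array_replace() idea above can be illustrated with a short Python sketch. It is only a model of the suggestion, under stated assumptions: `None` stands for SQL NULL, Python's `==` stands in for IS NOT DISTINCT FROM (since `None == None` is true in Python), and the elements are strings so that the array stays homogeneous as `anyelement` would require.

```python
from typing import List, Optional


def array_replace(items: List[Optional[str]], search: Optional[str],
                  replacement: Optional[str]) -> List[Optional[str]]:
    """Return a new list with every element equal to search swapped for replacement.

    Python's == treats None as equal to None, which is the
    IS NOT DISTINCT FROM comparison the message describes.
    """
    return [replacement if item == search else item for item in items]


# to_string-like behaviour: replace NULLs first, then join as usual.
filled = array_replace(["1", "2", None, "4"], None, "***")
print(",".join(filled))  # 1,2,***,4

# to_array-like behaviour: split as usual, then turn empty strings into NULLs.
print(array_replace("1,2,,4".split(","), "", None))  # ['1', '2', None, '4']
```

Note that this composition materialises an intermediate array before the join or after the split, instead of handling NULLs in a single pass.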
2010/7/21 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro
> <itagaki.takahiro@gmail.com> wrote:
>> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
>>> here is a new version - new these functions are not a strict and
>>> function to_string is marked as stable.
>>
>> We have array_to_string(anyarray, text) and string_to_array(text, text),
>> and you'll introduce to_string(anyarray, text, text) and
>> to_array(text, text, text).
>> Do we think it is good idea to have different names for them? IMHO, we'd
>> better use 3 arguments version of array_to_string() instead of the
>> new to_string() ?
>
> The worst part is that the new names are not very mnemonic.
>
> I think maybe what we really need here is array equivalents of
> COALESCE() and NULLIF(). It looks like the proposed to_string()
> function is basically equivalent to replacing each NULL entry with the
> array with a given value, and then doing array_to_string() as usual.
> And it looks like the proposed to_array function basically does the
> same thing as to_array(), and then replaces empty strings with NULL or
> some other value.
>
> Maybe we just need a function array_replace(anyarray, anyelement,
> anyelement) that replaces any element in the array that IS NOT
> DISTINCT FROM $2 with $3 and returns the new array. That could be
> useful for other things besides this particular case, too.
>

I don't agree. Building or updating any array is somewhat expensive. There can be the same performance issue as with the combination of array_agg and array_to_string versus string_agg. I am not against possible name changes. But I am strongly of the opinion that the current string_to_array and array_to_string are buggy and have to be deprecated.

Regards
Pavel

p.s. can we use the names text_to_array and array_to_text?

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/7/21 Robert Haas <robertmhaas@gmail.com>:
>> On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro
>> <itagaki.takahiro@gmail.com> wrote:
>>> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
>>>> here is a new version - new these functions are not a strict and
>>>> function to_string is marked as stable.
>>>
>>> We have array_to_string(anyarray, text) and string_to_array(text, text),
>>> and you'll introduce to_string(anyarray, text, text) and
>>> to_array(text, text, text).
>>> Do we think it is good idea to have different names for them? IMHO, we'd
>>> better use 3 arguments version of array_to_string() instead of the
>>> new to_string() ?
>>
>> The worst part is that the new names are not very mnemonic.
>>
>> I think maybe what we really need here is array equivalents of
>> COALESCE() and NULLIF(). It looks like the proposed to_string()
>> function is basically equivalent to replacing each NULL entry with the
>> array with a given value, and then doing array_to_string() as usual.
>> And it looks like the proposed to_array function basically does the
>> same thing as to_array(), and then replaces empty strings with NULL or
>> some other value.
>>
>> Maybe we just need a function array_replace(anyarray, anyelement,
>> anyelement) that replaces any element in the array that IS NOT
>> DISTINCT FROM $2 with $3 and returns the new array. That could be
>> useful for other things besides this particular case, too.
>
> I don't agree. Building or updating any array is little bit expensive.
> There can be same performance issue like combination array_agg and
> array_to_string versus string_agg.

But is it really bad enough to introduce custom versions of every function that might want to do this sort of thing?

> I am not against to possible name
> changes. But I am strong in opinion so current string_to_array and
> array_to_string are buggy and have to be deprecated.

But I don't think anyone else agrees with you. The current behavior isn't the only one anyone might want, but it's one reasonable behavior.

> p.s. can we use a names - text_to_array, array_to_text ?

That's not going to reduce confusion one bit...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
2010/7/21 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2010/7/21 Robert Haas <robertmhaas@gmail.com>:
>>> On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro
>>> <itagaki.takahiro@gmail.com> wrote:
>>>> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
>>>>> here is a new version - new these functions are not a strict and
>>>>> function to_string is marked as stable.
>>>>
>>>> We have array_to_string(anyarray, text) and string_to_array(text, text),
>>>> and you'll introduce to_string(anyarray, text, text) and
>>>> to_array(text, text, text).
>>>> Do we think it is good idea to have different names for them? IMHO, we'd
>>>> better use 3 arguments version of array_to_string() instead of the
>>>> new to_string() ?
>>>
>>> The worst part is that the new names are not very mnemonic.
>>>
>>> I think maybe what we really need here is array equivalents of
>>> COALESCE() and NULLIF(). It looks like the proposed to_string()
>>> function is basically equivalent to replacing each NULL entry with the
>>> array with a given value, and then doing array_to_string() as usual.
>>> And it looks like the proposed to_array function basically does the
>>> same thing as to_array(), and then replaces empty strings with NULL or
>>> some other value.
>>>
>>> Maybe we just need a function array_replace(anyarray, anyelement,
>>> anyelement) that replaces any element in the array that IS NOT
>>> DISTINCT FROM $2 with $3 and returns the new array. That could be
>>> useful for other things besides this particular case, too.
>>
>> I don't agree. Building or updating any array is little bit expensive.
>> There can be same performance issue like combination array_agg and
>> array_to_string versus string_agg.
>
> But is it really bad enough to introduce custom versions of every
> function that might want to do this sort of thing?
>
>> I am not against to possible name
>> changes. But I am strong in opinion so current string_to_array and
>> array_to_string are buggy and have to be deprecated.
>
> But I don't think anyone else agrees with you. The current behavior
> isn't the only one anyone might want, but it's one reasonable
> behavior.

See the discussion of these functions; this is Merlin Moncure's proposal:

http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151503.html

These functions were designed in reaction to reported bugs and problems with serialisation and deserialisation of arrays with null fields.

You can't parse a string into an array with null values now:

postgres=# select string_to_array('1,2,3,null,5',',')::int[];
ERROR:  invalid input syntax for integer: "null"
postgres=#

Regards
Pavel Stehule

>
>> p.s. can we use a names - text_to_array, array_to_text ?
>
> That's not going to reduce confusion one bit...
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
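The failing cast in the message above has a direct analogue in most languages: once the text is split, the literal field "null" is just a string and cannot be converted to an integer. The Python sketch below illustrates that and how a null-string parameter avoids it. `parse_int_fields` is a hypothetical helper invented for this illustration; it is not part of the patch or of PostgreSQL.

```python
from typing import List, Optional


def parse_int_fields(text: str, sep: str,
                     null_string: Optional[str] = None) -> List[Optional[int]]:
    """Split text on sep and convert each field to int.

    A field equal to null_string becomes None (standing in for SQL NULL).
    With null_string=None no field is treated specially, so a marker such
    as 'null' reaches int() and raises ValueError.
    """
    result: List[Optional[int]] = []
    for field in text.split(sep):
        if null_string is not None and field == null_string:
            result.append(None)
        else:
            result.append(int(field))
    return result


print(parse_int_fields("1,2,3,null,5", ",", "null"))  # [1, 2, 3, None, 5]

try:
    parse_int_fields("1,2,3,null,5", ",")
except ValueError as exc:
    print("conversion failed:", exc)
```

Handling the marker during the split, before any type conversion happens, is what a third argument on the splitting function provides.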
2010/7/21 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/7/21 Robert Haas <robertmhaas@gmail.com>:
>> On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> 2010/7/21 Robert Haas <robertmhaas@gmail.com>:
>>>> On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro
>>>> <itagaki.takahiro@gmail.com> wrote:
>>>>> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>:
>>>>>> here is a new version - new these functions are not a strict and
>>>>>> function to_string is marked as stable.
>>>>>
>>>>> We have array_to_string(anyarray, text) and string_to_array(text, text),
>>>>> and you'll introduce to_string(anyarray, text, text) and
>>>>> to_array(text, text, text).
>>>>> Do we think it is good idea to have different names for them? IMHO, we'd
>>>>> better use 3 arguments version of array_to_string() instead of the
>>>>> new to_string() ?
>>>>
>>>> The worst part is that the new names are not very mnemonic.
>>>>
>>>> I think maybe what we really need here is array equivalents of
>>>> COALESCE() and NULLIF(). It looks like the proposed to_string()
>>>> function is basically equivalent to replacing each NULL entry with the
>>>> array with a given value, and then doing array_to_string() as usual.
>>>> And it looks like the proposed to_array function basically does the
>>>> same thing as to_array(), and then replaces empty strings with NULL or
>>>> some other value.
>>>>
>>>> Maybe we just need a function array_replace(anyarray, anyelement,
>>>> anyelement) that replaces any element in the array that IS NOT
>>>> DISTINCT FROM $2 with $3 and returns the new array. That could be
>>>> useful for other things besides this particular case, too.
>>>
>>> I don't agree. Building or updating any array is little bit expensive.
>>> There can be same performance issue like combination array_agg and
>>> array_to_string versus string_agg.
>>
>> But is it really bad enough to introduce custom versions of every
>> function that might want to do this sort of thing?

Please look at http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151475.html

I am not alone in the opinion that the current string-to-array functions are not well designed.

Regards
Pavel

>>
>>> I am not against to possible name
>>> changes. But I am strong in opinion so current string_to_array and
>>> array_to_string are buggy and have to be deprecated.
>>
>> But I don't think anyone else agrees with you. The current behavior
>> isn't the only one anyone might want, but it's one reasonable
>> behavior.
>
> see on discus to these function - this is Marlin Moncure proposal
>
> http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151503.html
>
> these functions was designed in reaction to reporting bugs and
> problems with serialisation and deserialisation of arrays with null
> fields.
>
> you can't to parse string to array with null values now
>
> postgres=# select string_to_array('1,2,3,null,5',',')::int[];
> ERROR: invalid input syntax for integer: "null"
> postgres=#
>
> Regards
>
> Pavel Stehule
>>
>>> p.s. can we use a names - text_to_array, array_to_text ?
>>
>> That's not going to reduce confusion one bit...
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise Postgres Company
>>
>
On Wed, Jul 21, 2010 at 8:14 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2010/7/21 Pavel Stehule <pavel.stehule@gmail.com>: >> 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >>> On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>> 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >>>>> On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro >>>>> <itagaki.takahiro@gmail.com> wrote: >>>>>> 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>: >>>>>>> here is a new version - new these functions are not a strict and >>>>>>> function to_string is marked as stable. >>>>>> >>>>>> We have array_to_string(anyarray, text) and string_to_array(text, text), >>>>>> and you'll introduce to_string(anyarray, text, text) and >>>>>> to_array(text, text, text). >>>>>> Do we think it is good idea to have different names for them? IMHO, we'd >>>>>> better use 3 arguments version of array_to_string() instead of the >>>>>> new to_string() ? >>>>> >>>>> The worst part is that the new names are not very mnemonic. >>>>> >>>>> I think maybe what we really need here is array equivalents of >>>>> COALESCE() and NULLIF(). It looks like the proposed to_string() >>>>> function is basically equivalent to replacing each NULL entry with the >>>>> array with a given value, and then doing array_to_string() as usual. >>>>> And it looks like the proposed to_array function basically does the >>>>> same thing as to_array(), and then replaces empty strings with NULL or >>>>> some other value. >>>>> >>>>> Maybe we just need a function array_replace(anyarray, anyelement, >>>>> anyelement) that replaces any element in the array that IS NOT >>>>> DISTINCT FROM $2 with $3 and returns the new array. That could be >>>>> useful for other things besides this particular case, too. >>>> >>>> I don't agree. Building or updating any array is little bit expensive. >>>> There can be same performance issue like combination array_agg and >>>> array_to_string versus string_agg. 
>>> >>> But is it really bad enough to introduce custom versions of every >>> function that might want to do this sort of thing? > > please look on http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151475.html > > I am not alone in opinion so current string to array functions has > not good design OK, I stand corrected, although I'm not totally convinced. I still think to_array() and to_string() are not a good choice of names. I am not sure if we should reuse the existing names (adding a third parameter) or pick something else, like array_concat() and split_to_array(). Also, should we consider putting these in contrib/stringfunc rather than core? Or is there enough support for core that we should stick with doing it that way? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
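The array_replace() idea quoted above, and the cost objection raised against it, can be illustrated with a small model. This is a sketch in Python rather than SQL, under stated assumptions: `is_not_distinct`, `array_replace` and `to_string_via_replace` are hypothetical names, `None` stands in for NULL, and the elements are kept as strings because the SQL signature (anyarray, anyelement, anyelement) would require the replacement to have the element type.

```python
from typing import List, Optional


def is_not_distinct(a: Optional[str], b: Optional[str]) -> bool:
    """Model of IS NOT DISTINCT FROM: two NULLs compare as equal."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def array_replace(arr: List[Optional[str]],
                  old: Optional[str],
                  new: Optional[str]) -> List[Optional[str]]:
    """Return a new list with every element not distinct from old replaced."""
    return [new if is_not_distinct(x, old) else x for x in arr]


def to_string_via_replace(arr: List[Optional[str]], sep: str, nullstr: str) -> str:
    """Two-step variant: replace NULLs first, then join the result."""
    replaced = array_replace(arr, None, nullstr)  # intermediate array is built here
    return sep.join(replaced)


print(to_string_via_replace(["1", "2", None, "4"], ",", "***"))
```

The intermediate list built in `to_string_via_replace` is the extra work the performance objection refers to; a dedicated function can substitute the null string while joining and skip that step.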
> OK, I stand corrected, although I'm not totally convinced. I still > think to_array() and to_string() are not a good choice of names. I am > not sure if we should reuse the existing names (adding a third > parameter) or pick something else, like array_concat() and > split_to_array(). > It was discussed before. I would like to see some symmetry in the names. The bad thing is that great names like string_to_array and array_to_string are already used, and the second bad thing was done three years ago, when nobody was thinking about NULL values. I don't think we are able to repair the older functions - their default behaviour simply isn't optimal. I think we have to decide about deprecating string_to_array and array_to_string first. If these functions are deprecated, then we can use similar names (and probably we should) - so text_to_array and array_to_text would be acceptable. If not, then this discussion is needless - then to_string and to_array can go at most into contrib - stringfunc is a good idea - and maybe we don't need to think about new names. > Also, should we consider putting these in contrib/stringfunc rather > than core? Or is there enough support for core that we should stick > with doing it that way? > So that is one variant. I am not against moving these functions to contrib/stringfunc. I think we have to settle the question of marking string_to_array and array_to_string as deprecated first. Then we can move forward. My opinion is known - I am for removing these functions in the future and replacing them with modernized functions. Other opinions? Can we move forward? Regards Pavel > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise Postgres Company >
On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > It was discussed before. I would to see some symmetry in names. That's reasonable. > The > bad thing is so great names like string_to_array and array_to_string > is used, Yeah, those names are not too good. > and second bad thing was done three years ago when nobody > thinking about NULL values. I don't think, so we are able to repair > older functions - simply the default behave isn't optimal. This is a matter of opinion, but certainly it's not right for everyone. > I am thinking so we have to do decision about string_to_array and > array_to_string deprecation first. If these function will be > deprecated, then we can use a similar names (and probably we should to > use a similar names) - so text_to_array or array_to_string can be > acceptable. If not, then this discus is needless - then to_string and > to_array have to be maximally in contrib - stringfunc is good idea - > and maybe we don't need thinking about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On 22 July 2010 01:55, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> I am thinking so we have to do decision about string_to_array and >> array_to_string deprecation first. > > Well, -1 from me for deprecating string_to_array and array_to_string. > For what it's worth, I agree with Pavel about the current behaviour in core. It's broken whenever NULLs come into play. We need to improve on this one way or another, and I think it would be a shame to deal with a problem in core by adding something to contrib. > I am not in favor of the names to_string and to_array even if we put > them in contrib, though. The problem with string_to_array and > array_to_string is that they aren't descriptive enough, and > to_string/to_array is even less so. What about implode() and explode()? It's got symmetry and it's possibly more descriptive. Cheers, BJ
2010/7/21 Robert Haas <robertmhaas@gmail.com>: > On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> It was discussed before. I would to see some symmetry in names. > > That's reasonable. > >> The >> bad thing is so great names like string_to_array and array_to_string >> is used, > > Yeah, those names are not too good. > >> and second bad thing was done three years ago when nobody >> thinking about NULL values. I don't think, so we are able to repair >> older functions - simply the default behave isn't optimal. > > This is a matter of opinion, but certainly it's not right for everyone. > >> I am thinking so we have to do decision about string_to_array and >> array_to_string deprecation first. If these function will be >> deprecated, then we can use a similar names (and probably we should to >> use a similar names) - so text_to_array or array_to_string can be >> acceptable. If not, then this discus is needless - then to_string and >> to_array have to be maximally in contrib - stringfunc is good idea - >> and maybe we don't need thinking about new names. > > Well, -1 from me for deprecating string_to_array and array_to_string. > > I am not in favor of the names to_string and to_array even if we put > them in contrib, though. The problem with string_to_array and > array_to_string is that they aren't descriptive enough, and > to_string/to_array is even less so. > I am not a native English speaker, so I have a different feeling. These functions do array serialisation and deserialisation, but those names are too long. I have no idea for better names - text->array, array->text is descriptive enough (for me) - and these names show the symmetry between the functions very clearly. I have to repeat - it is very clear for a non-native speaker. > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise Postgres Company >
On Wed, Jul 21, 2010 at 12:08 PM, Brendan Jurd <direvus@gmail.com> wrote: > On 22 July 2010 01:55, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> I am thinking so we have to do decision about string_to_array and >>> array_to_string deprecation first. >> >> Well, -1 from me for deprecating string_to_array and array_to_string. >> > > For what it's worth, I agree with Pavel about the current behaviour in > core. It's broken whenever NULLs come into play. We need to improve > on this one way or another, and I think it would be a shame to deal > with a problem in core by adding something to contrib. Fair enough. I'm OK with putting it in core if we can come up with suitable names. >> I am not in favor of the names to_string and to_array even if we put >> them in contrib, though. The problem with string_to_array and >> array_to_string is that they aren't descriptive enough, and >> to_string/to_array is even less so. > > What about implode() and explode()? It's got symmetry and it's > possibly more descriptive. Hmm, it's a thought. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> I am thinking so we have to do decision about string_to_array and >>> array_to_string deprecation first. If these function will be >>> deprecated, then we can use a similar names (and probably we should to >>> use a similar names) - so text_to_array or array_to_string can be >>> acceptable. If not, then this discus is needless - then to_string and >>> to_array have to be maximally in contrib - stringfunc is good idea - >>> and maybe we don't need thinking about new names. >> >> Well, -1 from me for deprecating string_to_array and array_to_string. >> >> I am not in favor of the names to_string and to_array even if we put >> them in contrib, though. The problem with string_to_array and >> array_to_string is that they aren't descriptive enough, and >> to_string/to_array is even less so. > > I am not a English native speaker, so I have a different feeling. > These functions do array_serialisation and array_deseralisation, but > this names are too long. I have not idea about better names - it is > descriptive well (for me) text->array, array->text - and these names > shows very cleanly symmetry between functions. I have to repeat - it > is very clean for not native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. array_split() and array_join(), following Perl? array_implode() and array_explode(), along the lines suggested by Brendan? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Jul 21, 2010, at 12:30 , Robert Haas wrote: > array_split() and array_join(), following Perl? +1. Seems common in other languages such as Ruby, Python, and Java as well. Michael Glaesemann grzm seespotcode net
2010/7/21 Robert Haas <robertmhaas@gmail.com>: > On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>> I am thinking so we have to do decision about string_to_array and >>>> array_to_string deprecation first. If these function will be >>>> deprecated, then we can use a similar names (and probably we should to >>>> use a similar names) - so text_to_array or array_to_string can be >>>> acceptable. If not, then this discus is needless - then to_string and >>>> to_array have to be maximally in contrib - stringfunc is good idea - >>>> and maybe we don't need thinking about new names. >>> >>> Well, -1 from me for deprecating string_to_array and array_to_string. >>> >>> I am not in favor of the names to_string and to_array even if we put >>> them in contrib, though. The problem with string_to_array and >>> array_to_string is that they aren't descriptive enough, and >>> to_string/to_array is even less so. >> >> I am not a English native speaker, so I have a different feeling. >> These functions do array_serialisation and array_deseralisation, but >> this names are too long. I have not idea about better names - it is >> descriptive well (for me) text->array, array->text - and these names >> shows very cleanly symmetry between functions. I have to repeat - it >> is very clean for not native speaker. > > Well, the problem is that array_to_string(), for example, tells you > that an array is being converted to a string, but not how. And > to_string() tells you that you're getting a string, but it doesn't > tell you either what you're getting it from or how you're getting it. > We already have a function to_char() which can be used to format a > whole bunch of different types as strings; I can't see adding a new > function with almost the same name that does something completely > different. > > array_split() and array_join(), following Perl? array_implode() and > array_explode(), along the lines suggested by Brendan? 
I have a problem with array_split - because it is a string that gets split there. I looked on the net - and languages usually use just "split" and "join". split is a method of the String class in Java. So when I follow Perl, I feel better with just "split" and "join", but "join" is a keyword :( - stepping back, maybe string_split X array_join ? select string_split('1,2,3,4',','); select array_join(array[1,2,3,4],','); so my preferences: 1. split, join - I checked - we are able to create a "join" function 2. split, array_join - if only "join" is a problem 3. string_split, array_join - there is no clean symmetry, but it respects the widely used semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot like array_split - it is a contradiction for me. Pavel p.s. This is a typical use case for packages - with them, we could have the functions string.split() and array.join() > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise Postgres Company >
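Whatever name wins, the join direction is where the two behaviours argued over in this thread differ. The sketch below models both in Python; it is an illustration only, not the patch. `array_join` borrows one of the candidate names from the list above, its `nullstr` parameter is assumed to work like the third argument of the proposed to_string(), and `array_to_string_legacy` is a hypothetical stand-in for the existing two-argument function, which drops NULL elements.

```python
from typing import Optional, Sequence


def array_join(arr: Sequence[Optional[object]], sep: str, nullstr: str = "") -> str:
    """Join elements with sep, writing nullstr wherever an element is None."""
    return sep.join(nullstr if x is None else str(x) for x in arr)


def array_to_string_legacy(arr: Sequence[Optional[object]], sep: str) -> str:
    """Model of the existing behaviour: None elements are skipped entirely."""
    return sep.join(str(x) for x in arr if x is not None)


values = [1, 2, 3, 4, None, 6]
print(array_join(values, ","))               # NULL kept as an empty field
print(array_join(values, ",", "***"))        # NULL written as the marker
print(array_to_string_legacy(values, ","))   # NULL dropped, positions shift
```

Because the legacy form drops the element, the position of every later field shifts and the original array cannot be rebuilt from the string; keeping a placeholder field is what makes a split/join pair symmetric.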
On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >> On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>>> I am thinking so we have to do decision about string_to_array and >>>>> array_to_string deprecation first. If these function will be >>>>> deprecated, then we can use a similar names (and probably we should to >>>>> use a similar names) - so text_to_array or array_to_string can be >>>>> acceptable. If not, then this discus is needless - then to_string and >>>>> to_array have to be maximally in contrib - stringfunc is good idea - >>>>> and maybe we don't need thinking about new names. >>>> >>>> Well, -1 from me for deprecating string_to_array and array_to_string. >>>> >>>> I am not in favor of the names to_string and to_array even if we put >>>> them in contrib, though. The problem with string_to_array and >>>> array_to_string is that they aren't descriptive enough, and >>>> to_string/to_array is even less so. >>> >>> I am not a English native speaker, so I have a different feeling. >>> These functions do array_serialisation and array_deseralisation, but >>> this names are too long. I have not idea about better names - it is >>> descriptive well (for me) text->array, array->text - and these names >>> shows very cleanly symmetry between functions. I have to repeat - it >>> is very clean for not native speaker. >> >> Well, the problem is that array_to_string(), for example, tells you >> that an array is being converted to a string, but not how. And >> to_string() tells you that you're getting a string, but it doesn't >> tell you either what you're getting it from or how you're getting it. >> We already have a function to_char() which can be used to format a >> whole bunch of different types as strings; I can't see adding a new >> function with almost the same name that does something completely >> different. 
>> >> array_split() and array_join(), following Perl? array_implode() and >> array_explode(), along the lines suggested by Brendan? > > I have a problem with array_split - because there string is split. I > looked on net - and languages usually uses a "split" or "join". split > is method of str class in Java. So when I am following Perl, I feel > better with just only "split" and "join", but "join" is keyword :( - > step back, maybe string_split X array_join ? > > select string_split('1,2,3,4',','); > select array_join(array[1,2,3,4],','); > > so my preferences: > > 1. split, join - I checked - we are able to create "join" function > 2. split, array_join - when only "join" can be a problem > 3. string_split, array_join - there are not clean symmetry, but it > respect wide used a semantics - string.split, array.join > 4. explode, implode > 5. array_explode, array_implode > -- I cannot to like array_split - it is contradiction for me. Well, I guess I prefer my suggestion to any of those (I know... what a surprise), but I think I could live with #3, #4, or #5. It's hard for me to imagine that we really want to create a function called just join(), given the other meanings that JOIN already has in SQL. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
2010/7/21 Robert Haas <robertmhaas@gmail.com>: > On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >>> On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>>>> I am thinking so we have to do decision about string_to_array and >>>>>> array_to_string deprecation first. If these function will be >>>>>> deprecated, then we can use a similar names (and probably we should to >>>>>> use a similar names) - so text_to_array or array_to_string can be >>>>>> acceptable. If not, then this discus is needless - then to_string and >>>>>> to_array have to be maximally in contrib - stringfunc is good idea - >>>>>> and maybe we don't need thinking about new names. >>>>> >>>>> Well, -1 from me for deprecating string_to_array and array_to_string. >>>>> >>>>> I am not in favor of the names to_string and to_array even if we put >>>>> them in contrib, though. The problem with string_to_array and >>>>> array_to_string is that they aren't descriptive enough, and >>>>> to_string/to_array is even less so. >>>> >>>> I am not a English native speaker, so I have a different feeling. >>>> These functions do array_serialisation and array_deseralisation, but >>>> this names are too long. I have not idea about better names - it is >>>> descriptive well (for me) text->array, array->text - and these names >>>> shows very cleanly symmetry between functions. I have to repeat - it >>>> is very clean for not native speaker. >>> >>> Well, the problem is that array_to_string(), for example, tells you >>> that an array is being converted to a string, but not how. And >>> to_string() tells you that you're getting a string, but it doesn't >>> tell you either what you're getting it from or how you're getting it. 
>>> We already have a function to_char() which can be used to format a >>> whole bunch of different types as strings; I can't see adding a new >>> function with almost the same name that does something completely >>> different. >>> >>> array_split() and array_join(), following Perl? array_implode() and >>> array_explode(), along the lines suggested by Brendan? >> >> I have a problem with array_split - because there string is split. I >> looked on net - and languages usually uses a "split" or "join". split >> is method of str class in Java. So when I am following Perl, I feel >> better with just only "split" and "join", but "join" is keyword :( - >> step back, maybe string_split X array_join ? >> >> select string_split('1,2,3,4',','); >> select array_join(array[1,2,3,4],','); >> >> so my preferences: >> >> 1. split, join - I checked - we are able to create "join" function >> 2. split, array_join - when only "join" can be a problem >> 3. string_split, array_join - there are not clean symmetry, but it >> respect wide used a semantics - string.split, array.join >> 4. explode, implode >> 5. array_explode, array_implode >> -- I cannot to like array_split - it is contradiction for me. > > Well, I guess I prefer my suggestion to any of those (I know... what a > surprise), but I think I could live with #3, #4, or #5. It's hard for > me to imagine that we really want to create a function called just > join(), given the other meanings that JOIN already has in SQL. It has no relation to the SQL language - but I don't expect that something like this would be accepted by Tom :). So for the moment we are in agreement on #3, #4, #5. I think we can wait one or two days for the opinions of others - and then I'll fix the patch. OK? Regards Pavel > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise Postgres Company >
On Wed, Jul 21, 2010 at 2:25 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >> On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>> 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >>>> On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>>>>> I am thinking so we have to do decision about string_to_array and >>>>>>> array_to_string deprecation first. If these function will be >>>>>>> deprecated, then we can use a similar names (and probably we should to >>>>>>> use a similar names) - so text_to_array or array_to_string can be >>>>>>> acceptable. If not, then this discus is needless - then to_string and >>>>>>> to_array have to be maximally in contrib - stringfunc is good idea - >>>>>>> and maybe we don't need thinking about new names. >>>>>> >>>>>> Well, -1 from me for deprecating string_to_array and array_to_string. >>>>>> >>>>>> I am not in favor of the names to_string and to_array even if we put >>>>>> them in contrib, though. The problem with string_to_array and >>>>>> array_to_string is that they aren't descriptive enough, and >>>>>> to_string/to_array is even less so. >>>>> >>>>> I am not a English native speaker, so I have a different feeling. >>>>> These functions do array_serialisation and array_deseralisation, but >>>>> this names are too long. I have not idea about better names - it is >>>>> descriptive well (for me) text->array, array->text - and these names >>>>> shows very cleanly symmetry between functions. I have to repeat - it >>>>> is very clean for not native speaker. >>>> >>>> Well, the problem is that array_to_string(), for example, tells you >>>> that an array is being converted to a string, but not how. And >>>> to_string() tells you that you're getting a string, but it doesn't >>>> tell you either what you're getting it from or how you're getting it. 
>>>> We already have a function to_char() which can be used to format a >>>> whole bunch of different types as strings; I can't see adding a new >>>> function with almost the same name that does something completely >>>> different. >>>> >>>> array_split() and array_join(), following Perl? array_implode() and >>>> array_explode(), along the lines suggested by Brendan? >>> >>> I have a problem with array_split - because there string is split. I >>> looked on net - and languages usually uses a "split" or "join". split >>> is method of str class in Java. So when I am following Perl, I feel >>> better with just only "split" and "join", but "join" is keyword :( - >>> step back, maybe string_split X array_join ? >>> >>> select string_split('1,2,3,4',','); >>> select array_join(array[1,2,3,4],','); >>> >>> so my preferences: >>> >>> 1. split, join - I checked - we are able to create "join" function >>> 2. split, array_join - when only "join" can be a problem >>> 3. string_split, array_join - there are not clean symmetry, but it >>> respect wide used a semantics - string.split, array.join >>> 4. explode, implode >>> 5. array_explode, array_implode >>> -- I cannot to like array_split - it is contradiction for me. >> >> Well, I guess I prefer my suggestion to any of those (I know... what a >> surprise), but I think I could live with #3, #4, or #5. It's hard for >> me to imagine that we really want to create a function called just >> join(), given the other meanings that JOIN already has in SQL. > > it hasn't any relation to SQL language - but I don't expect so some > like this can be accepted by Tom :). So for this moment we are in > agreement on #3, #4, #5. I think, we can wait one or two days for > opinions of others - and than I'll fix patch. ok? Yeah, I'd like some more votes, too. Aside from what I suggested (array_join/array_split), I think my favorite is your #5. 
We might also want to put some work into documenting the differences between the old and new functions clearly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
2010/7/21 Robert Haas <robertmhaas@gmail.com>: > On Wed, Jul 21, 2010 at 2:25 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >>> On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>> 2010/7/21 Robert Haas <robertmhaas@gmail.com>: >>>>> On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >>>>>>>> I am thinking so we have to do decision about string_to_array and >>>>>>>> array_to_string deprecation first. If these function will be >>>>>>>> deprecated, then we can use a similar names (and probably we should to >>>>>>>> use a similar names) - so text_to_array or array_to_string can be >>>>>>>> acceptable. If not, then this discus is needless - then to_string and >>>>>>>> to_array have to be maximally in contrib - stringfunc is good idea - >>>>>>>> and maybe we don't need thinking about new names. >>>>>>> >>>>>>> Well, -1 from me for deprecating string_to_array and array_to_string. >>>>>>> >>>>>>> I am not in favor of the names to_string and to_array even if we put >>>>>>> them in contrib, though. The problem with string_to_array and >>>>>>> array_to_string is that they aren't descriptive enough, and >>>>>>> to_string/to_array is even less so. >>>>>> >>>>>> I am not a English native speaker, so I have a different feeling. >>>>>> These functions do array_serialisation and array_deseralisation, but >>>>>> this names are too long. I have not idea about better names - it is >>>>>> descriptive well (for me) text->array, array->text - and these names >>>>>> shows very cleanly symmetry between functions. I have to repeat - it >>>>>> is very clean for not native speaker. >>>>> >>>>> Well, the problem is that array_to_string(), for example, tells you >>>>> that an array is being converted to a string, but not how. And >>>>> to_string() tells you that you're getting a string, but it doesn't >>>>> tell you either what you're getting it from or how you're getting it. 
>>>>> We already have a function to_char() which can be used to format a >>>>> whole bunch of different types as strings; I can't see adding a new >>>>> function with almost the same name that does something completely >>>>> different. >>>>> >>>>> array_split() and array_join(), following Perl? array_implode() and >>>>> array_explode(), along the lines suggested by Brendan? >>>> >>>> I have a problem with array_split - because there string is split. I >>>> looked on net - and languages usually uses a "split" or "join". split >>>> is method of str class in Java. So when I am following Perl, I feel >>>> better with just only "split" and "join", but "join" is keyword :( - >>>> step back, maybe string_split X array_join ? >>>> >>>> select string_split('1,2,3,4',','); >>>> select array_join(array[1,2,3,4],','); >>>> >>>> so my preferences: >>>> >>>> 1. split, join - I checked - we are able to create "join" function >>>> 2. split, array_join - when only "join" can be a problem >>>> 3. string_split, array_join - there are not clean symmetry, but it >>>> respect wide used a semantics - string.split, array.join >>>> 4. explode, implode >>>> 5. array_explode, array_implode >>>> -- I cannot to like array_split - it is contradiction for me. >>> >>> Well, I guess I prefer my suggestion to any of those (I know... what a >>> surprise), but I think I could live with #3, #4, or #5. It's hard for >>> me to imagine that we really want to create a function called just >>> join(), given the other meanings that JOIN already has in SQL. >> >> it hasn't any relation to SQL language - but I don't expect so some >> like this can be accepted by Tom :). So for this moment we are in >> agreement on #3, #4, #5. I think, we can wait one or two days for >> opinions of others - and than I'll fix patch. ok? > > Yeah, I'd like some more votes, too. Aside from what I suggested > (array_join/array_split), I think my favorite is your #5. 
> ok, #5 - it is absolutely foreign to me - in Czech, explode and implode are used only in relation to bombs. At this moment I have trouble seeing how they relate to string_to_array and array_to_string - this is nothing against your opinion, it just means that they don't carry any meaning for me - and probably for a lot of foreign developers. But I found on the net that people do use these names. > We might also want to put some work into documenting the differences > between the old and new functions clearly. > sure Pavel > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise Postgres Company >
On Wed, Jul 21, 2010 at 2:28 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Yeah, I'd like some more votes, too. Aside from what I suggested > (array_join/array_split), I think my favorite is your #5. -1 for me for any name that is of the form of: type_operation(); we don't have bytea_encode, array_unnest(), date_to_char(), etc. the non-internal ones that we do have (mostly array funcs), are improperly named imo. this is sql, not c. suppose we want to extend string serialization to row types? why not serialize/unserialize? merlin
On Wed, Jul 21, 2010 at 02:38:17PM -0400, Merlin Moncure wrote: > On Wed, Jul 21, 2010 at 2:28 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > Yeah, I'd like some more votes, too. Aside from what I suggested > > (array_join/array_split), I think my favorite is your #5. > > -1 for me for any name that is of the form of: > type_operation(); > > we don't have bytea_encode, array_unnest(), date_to_char(), etc. the > non-internal ones that we do have (mostly array funcs), are improperly > named imo. this is sql, not c. suppose we want to extend string > serialization to row types? > > why not serialize/unserialize? Because that's not what the function actually does. FWIW, I'm for (im|ex)plode, as join()/split(), which I'd otherwise like, would run into too many issues with JOIN. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/
Robert Haas <robertmhaas@gmail.com> writes: >>>> so my preferences: >>>> >>>> 1. split, join - I checked - we are able to create "join" function >>>> 2. split, array_join - when only "join" can be a problem >>>> 3. string_split, array_join - there are not clean symmetry, but it >>>> respect wide used a semantics - string.split, array.join >>>> 4. explode, implode >>>> 5. array_explode, array_implode >>>> -- I cannot to like array_split - it is contradiction for me. > > Yeah, I'd like some more votes, too. I still don't see a compelling reason not to extend existing functions with a third argument. Or are we talking about deprecating them in the future (like remove their mention in the docs) and have the new names to replace them, with the new behavior as the default and the extended call convention as the old behavior? I'm not sure about that, so I think extending existing function is ok. Or we would have to have the new functions work well with other types too, so that it's compelling to move from the old ones. Regards, -- dim
2010/7/23 Dimitri Fontaine <dfontaine@hi-media.com>: > Robert Haas <robertmhaas@gmail.com> writes: >>>>> so my preferences: >>>>> >>>>> 1. split, join - I checked - we are able to create "join" function >>>>> 2. split, array_join - when only "join" can be a problem >>>>> 3. string_split, array_join - there are not clean symmetry, but it >>>>> respect wide used a semantics - string.split, array.join >>>>> 4. explode, implode >>>>> 5. array_explode, array_implode >>>>> -- I cannot to like array_split - it is contradiction for me. >> >> Yeah, I'd like some more votes, too. > > I still don't see a compelling reason not to extend existing functions > with a third argument. Or are we talking about deprecating them in the > future (like remove their mention in the docs) and have the new names to > replace them, with the new behavior as the default and the extended call > convention as the old behavior? The problem is just the incomplete default behavior :(. We can enhance the old functions, but we cannot change their default behavior, which means we would ignore NULLs in arrays forever. But that has not held for three years. It is a feature now. Please look in the archive; there was a discussion about it. > > I'm not sure about that, so I think extending existing function is ok. > > Or we would have to have the new functions work well with other types > too, so that it's compelling to move from the old ones. I would not want to replace or enhance the to_char function. I plan to use the names "implode" and "explode". Regards Pavel Stehule > > Regards, > -- > dim >
Hello, I am sending an updated patch. There is only one significant change from the last patch: the function to_string was renamed to "implode" and to_array was renamed to "explode". Regards Pavel Stehule
Attachment
Hello Dimitri >> >> I still don't see a compelling reason not to extend existing functions >> with a third argument. Or are we talking about deprecating them in the >> future (like remove their mention in the docs) and have the new names to >> replace them, with the new behavior as the default and the extended call >> convention as the old behavior? > > just incomplete default behave :(. We can enhance old functions, but > we cannot to change default behave - it is mean, so we will to ignore > a NULLs in arrays forever - but it isn't true a three years. It is a > feature now. Please look to archive. There was a discus about it. > The reason why I insist on changing the default behavior is just one: I know only one use case for the current behavior, where array_to_string ignores NULLs. When you want to remove NULLs from an array, you can do string_to_array(array_to_string(x,'###'), '###'). I don't know of any other use case. When I have a NULL in an array, it is there for a reason. And I can remove NULLs from an array in a safer and faster way: SELECT array(SELECT v FROM unnest(x) g(v) WHERE v IS NOT NULL). Using string_to_array and array_to_string is slower and for some domains (for text) can give wrong results. Regards Pavel p.s. I expect that nobody keeps NULLs in an array just for fun.
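The two ways of dropping NULLs that Pavel compares can be sketched side by side. This is only an illustration: the array literals are made up for the example, and the unnest alias is written as g(v) so that the column name matches the WHERE clause.

```sql
-- Round trip through text: the two-argument array_to_string skips the NULL,
-- so splitting the string again yields an array without it.
-- Note that the result is text[], not int[].
SELECT string_to_array(array_to_string(ARRAY[1, NULL, 3], '###'), '###');
-- {1,3}

-- Filtering with unnest keeps the element type and never parses text.
SELECT ARRAY(SELECT v FROM unnest(ARRAY[1, NULL, 3]) AS g(v) WHERE v IS NOT NULL);
-- {1,3}

-- Why the round trip "can be wrong for text": an element that happens to
-- contain the separator is split into two elements.
SELECT string_to_array(array_to_string(ARRAY['a###b', NULL], '###'), '###');
-- {a,b}
```

The last query returns two elements where the input had only one non-NULL element, which is the kind of silent corruption the unnest form avoids.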
Pavel Stehule wrote: > 2010/7/21 Itagaki Takahiro <itagaki.takahiro@gmail.com>: > > 2010/7/20 Pavel Stehule <pavel.stehule@gmail.com>: > >> here is a new version - new these functions are not a strict and > >> function to_string is marked as stable. > > > > We have array_to_string(anyarray, text) and string_to_array(text, text), > > and you'll introduce to_string(anyarray, text, text) and > > to_array(text, text, text). > > I have to repeat it, the behave of this functions are little bit > different. string_to_array and array_to_string are buggy. > > * it isn't support a NULL > * it doesn't differentiate a empty array and NULL > * we cannot to change default behave of existing functions > * array_to_string is badly marked as IMMUTABLE This email thread, linked from our TODO list, explains that arrays combined with NULLs have many inconsistencies: http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
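The volatility marking mentioned in the quoted list can be checked against the system catalog. The query below is a minimal sketch assuming a stock PostgreSQL installation; it only reads pg_proc and makes no claim about what any particular server version reports.

```sql
-- Inspect the declared volatility of the existing conversion functions.
-- provolatile is 'i' (immutable), 's' (stable) or 'v' (volatile);
-- pronargs distinguishes overloads by their number of arguments.
SELECT proname, pronargs, provolatile
FROM pg_catalog.pg_proc
WHERE proname IN ('array_to_string', 'string_to_array')
ORDER BY proname, pronargs;
```

A function that calls element output routines is usually a candidate for STABLE rather than IMMUTABLE, since the text form of some types depends on session settings; that is presumably the concern behind the last bullet.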
Apparently, the message I sent (quoted below) didn't make it to -hackers. I know that Pavel received the message, as he replied to it. I'm calling shenanigans on the mailing list server, but in the meantime, here are those diffs again. On 31 July 2010 07:37, Brendan Jurd <direvus@gmail.com> wrote: > Hi Pavel, > > I've reviewed your latest patch (which I refer to as v3 to keep > continuity with previous versions under the "to_array" naming system). > > You didn't quite complete the rename of the functions; in-code > comments and regression tests still referred to the old names. I > cleaned that up for you and also reworded some of the in-code comments > for clarity. > > Otherwise the patch looks good and the functions now work exactly as I > would expect. > > I also went ahead and added some more documentation to explain how > (im|ex)plode differ from their foo_to_bar counterparts, and what kind > of behaviour you'll get by specifying the arguments as NULL. > > I have attached v4 of the patch against HEAD, and also an incremental > patch showing just my changes against v3. > > I'll mark this as ready for committer. > > Cheers, > BJ >
Attachment
Brendan Jurd <direvus@gmail.com> writes: >> I have attached v4 of the patch against HEAD, and also an incremental >> patch showing just my changes against v3. >> >> I'll mark this as ready for committer. Looking at this, I want to question the implode/explode naming. I think those names are too cute by half, not particularly mnemonic, not visibly related to the similar existing functions, and not friendly to any future extension in the same area. My first thought is that we should go back to the string_to_array and array_to_string names. The key reason not to use those names was the conflict with the old functions if you didn't specify a third argument, but where is the advantage of not specifying the third argument? It would be a lot simpler for people to understand if we just said "the two-argument forms work like this, while the three-argument forms work like that". This is especially reasonable because the difference in behavior is about nulls in the array, which is exactly what the third argument exists to specify. [ Sorry for not complaining about this before, but I was on vacation when the previous naming discussion went on. ] regards, tom lane
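The "two-argument forms work like this, three-argument forms work like that" split can be illustrated with the values from the start of the thread. This sketch assumes the optional third argument as documented for array_to_string and string_to_array in PostgreSQL 9.1 and later; the thread itself does not show the committed signatures.

```sql
-- Two-argument array_to_string: NULL elements are silently skipped.
SELECT array_to_string(ARRAY[1, 2, 3, 4, NULL, 6], ',');
-- 1,2,3,4,6

-- Three-argument form: the third argument is the text written for NULLs.
SELECT array_to_string(ARRAY[1, 2, 3, 4, NULL, 6], ',', '***');
-- 1,2,3,4,***,6

-- Two-argument string_to_array: an empty field becomes an empty string.
SELECT string_to_array('1,2,3,4,,6', ',');
-- {1,2,3,4,"",6}

-- Three-argument form: fields equal to the third argument become NULL.
SELECT string_to_array('1,2,3,4,,6', ',', '');
-- {1,2,3,4,NULL,6}
```

Because the old behavior stays attached to the two-argument calls, existing queries keep working, and a caller who cares about NULLs opts in by naming the NULL representation explicitly.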
On Mon, Aug 9, 2010 at 4:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >>> I have attached v4 of the patch against HEAD, and also an incremental >>> patch showing just my changes against v3. >>> >>> I'll mark this as ready for committer. > > Looking at this, I want to question the implode/explode naming. I think > those names are too cute by half, not particularly mnemonic, not visibly > related to the similar existing functions, and not friendly to any > future extension in the same area. > > My first thought is that we should go back to the string_to_array and > array_to_string names. The key reason not to use those names was the > conflict with the old functions if you didn't specify a third argument, > but where is the advantage of not specifying the third argument? It > would be a lot simpler for people to understand if we just said "the > two-argument forms work like this, while the three-argument forms work > like that". This is especially reasonable because the difference in > behavior is about nulls in the array, which is exactly what the third > argument exists to specify. > > [ Sorry for not complaining about this before, but I was on vacation > when the previous naming discussion went on. ] I can live with that, as long as it's clearly explained in the docs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Mon, Aug 9, 2010 at 4:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >>> I have attached v4 of the patch against HEAD, and also an incremental >>> patch showing just my changes against v3. >>> >>> I'll mark this as ready for committer. > > Looking at this, I want to question the implode/explode naming. I think > those names are too cute by half, not particularly mnemonic, not visibly > related to the similar existing functions, and not friendly to any > future extension in the same area. > > My first thought is that we should go back to the string_to_array and > array_to_string names. The key reason not to use those names was the > conflict with the old functions if you didn't specify a third argument, > but where is the advantage of not specifying the third argument? It > would be a lot simpler for people to understand if we just said "the > two-argument forms work like this, while the three-argument forms work > like that". This is especially reasonable because the difference in > behavior is about nulls in the array, which is exactly what the third > argument exists to specify. Is there any reason why array functions need the type prefix when other type conversion functions don't? Why didn't we name unnest() array_unnest()? merlin
Merlin Moncure <mmoncure@gmail.com> writes: > Is there any reason why array functions need the type prefix when > other type conversion functions don't? Why didn't we name unnest() > array_unnest()? UNNEST() is in the standard, IIRC, so you'd have to ask the SQL committee that. (And no, they're not exactly being consistent either, see array_agg() for example.) But anyway, my point here is that these functions are close enough to the existing string_to_array/array_to_string functions that they should be presented as variants of those, not arbitrarily assigned unrelated new names. Whether we'd have chosen different names if we had it to do over is academic. regards, tom lane
On Aug 9, 2010, at 1:10 PM, Robert Haas wrote: >> My first thought is that we should go back to the string_to_array and >> array_to_string names. The key reason not to use those names was the >> conflict with the old functions if you didn't specify a third argument, >> but where is the advantage of not specifying the third argument? It >> would be a lot simpler for people to understand if we just said "the >> two-argument forms work like this, while the three-argument forms work >> like that". This is especially reasonable because the difference in >> behavior is about nulls in the array, which is exactly what the third >> argument exists to specify. >> >> [ Sorry for not complaining about this before, but I was on vacation >> when the previous naming discussion went on. ] > > I can live with that, as long as it's clearly explained in the docs. +1 David
On Mon, Aug 9, 2010 at 4:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> Is there any reason why array functions need the type prefix when >> other type conversion functions don't? Why didn't we name unnest() >> array_unnest()? > > UNNEST() is in the standard, IIRC, so you'd have to ask the SQL > committee that. (And no, they're not exactly being consistent either, > see array_agg() for example.) > > But anyway, my point here is that these functions are close enough to > the existing string_to_array/array_to_string functions that they should > be presented as variants of those, not arbitrarily assigned unrelated > new names. Whether we'd have chosen different names if we had it to do > over is academic. I don't think array_agg is the same case, because you're aggregating into an array, not from one. all the same, +1 to your names (didn't like explode much). merlin
David E. Wheeler wrote: > On Aug 9, 2010, at 1:10 PM, Robert Haas wrote: > > >> My first thought is that we should go back to the string_to_array and > >> array_to_string names. The key reason not to use those names was the > >> conflict with the old functions if you didn't specify a third argument, > >> but where is the advantage of not specifying the third argument? It > >> would be a lot simpler for people to understand if we just said "the > >> two-argument forms work like this, while the three-argument forms work > >> like that". This is especially reasonable because the difference in > >> behavior is about nulls in the array, which is exactly what the third > >> argument exists to specify. > >> > >> [ Sorry for not complaining about this before, but I was on vacation > >> when the previous naming discussion went on. ] > > > > I can live with that, as long as it's clearly explained in the docs. > > +1 +1 -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Hello 2010/8/9 Tom Lane <tgl@sss.pgh.pa.us>: > Brendan Jurd <direvus@gmail.com> writes: >>> I have attached v4 of the patch against HEAD, and also an incremental >>> patch showing just my changes against v3. >>> >>> I'll mark this as ready for committer. > > Looking at this, I want to question the implode/explode naming. I think > those names are too cute by half, not particularly mnemonic, not visibly > related to the similar existing functions, and not friendly to any > future extension in the same area. > > My first thought is that we should go back to the string_to_array and > array_to_string names. The key reason not to use those names was the > conflict with the old functions if you didn't specify a third argument, > but where is the advantage of not specifying the third argument? It > would be a lot simpler for people to understand if we just said "the > two-argument forms work like this, while the three-argument forms work > like that". This is especially reasonable because the difference in > behavior is about nulls in the array, which is exactly what the third > argument exists to specify. > The name isn't important; I trust that you or Robert can choose the best name. What is important is the default behavior. The starting idea was that a function which loses information isn't optimal, and that is the array_to_string problem: this function quietly skips NULL fields if there are any. That was the motivation for writing these functions. Regards Pavel Stehule > [ Sorry for not complaining about this before, but I was on vacation > when the previous naming discussion went on. ] > > regards, tom lane >
Brendan Jurd <direvus@gmail.com> writes: >> I have attached v4 of the patch against HEAD, and also an incremental >> patch showing just my changes against v3. >> >> I'll mark this as ready for committer. Applied, with the discussed changes and some code editing. regards, tom lane
2010/8/10 Tom Lane <tgl@sss.pgh.pa.us>: > Brendan Jurd <direvus@gmail.com> writes: >>> I have attached v4 of the patch against HEAD, and also an incremental >>> patch showing just my changes against v3. >>> >>> I'll mark this as ready for committer. > > Applied, with the discussed changes and some code editing. > > regards, tom lane > Thank you very much Regards Pavel Stehule