Thread: Blank-padding (was: Oracle buys Innobase)
On 2005-10-21 09:47, Tom Lane wrote: > Alex Turner <armtuk@gmail.com> writes: > >> It appears that casting to a char() causes spaces to be stripped (ignored) from the string: >> > mls=# select length('123 '::char(8)); > length > -------- > 3 > (1 row) > > >> I'm not sure about anyone else, but I would personaly consider that a bug? >> > > No, it's a feature, as per extensive discussion some time ago when we made it do that. The general rule is that trailingspaces in a char(n) are semantically insignificant. > > regards, tom lane > I remember that discussion, and I was for the change. However, upon doing some testing after reading the above, I wonder if the blank-stripping isn't too aggressive. I have a CHAR(6) field (say, named Z) that has "abc " in it. Suppose I want to append "x" to Z, with any leading spaces in Z PRESERVED. The following do not work in 8.0.4: select Z || 'x'; select Z::char(6) || 'x'; select Z::varchar(6) || 'x'; select (Z || ' ')::char(6) || 'x'; There are only two ways I've found: select rpad( Z, 6) || 'x'; -- but "rpad" is apparently not a SQL-standard function. select cast (Z || ' ' as varchar(6)) || 'x'; -- hokey but SQL-compliant Is there something I'm missing??? -- Dean
"Dean Gibson (DB Administrator)" <postgresql4@ultimeth.com> writes: > I remember that discussion, and I was for the change. However, upon > doing some testing after reading the above, I wonder if the > blank-stripping isn't too aggressive. I have a CHAR(6) field (say, > named Z) that has "abc " in it. Suppose I want to append "x" to Z, > with any leading spaces in Z PRESERVED. (You meant trailing spaces, I assume.) Why exactly would you want to do that? You decided by your choice of datatype that the trailing spaces weren't significant. This gripe seems to me exactly comparable to complaining if a numeric datatype doesn't remember how many trailing zeroes you typed after the decimal point. Those zeroes aren't semantically significant, so you have no case. regards, tom lane
Tom Lane wrote: >"Dean Gibson (DB Administrator)" <postgresql4@ultimeth.com> writes: > > >>I remember that discussion, and I was for the change. However, upon >>doing some testing after reading the above, I wonder if the >>blank-stripping isn't too aggressive. I have a CHAR(6) field (say, >>named Z) that has "abc " in it. Suppose I want to append "x" to Z, >>with any leading spaces in Z PRESERVED. >> >> > >(You meant trailing spaces, I assume.) Why exactly would you want to do >that? You decided by your choice of datatype that the trailing spaces >weren't significant. > I once built a telecom billing app where this might be important (fixed length fields). Lets say you have fixed length fields defined as char(n) datatypes. You may want to build a query to generate billing records like: select field1 || field2 || field3 || field4 || field5 ... AS bill_record FROM lec_billing_entries; It seels to me that I would expect trailing spaces to be preserved in these cases. Having an implicit rtrim function is asking for problems. Personally I would rather have to call rtrim explicitly than have the backend treat the concatenation differently than if I do it on the client. > This gripe seems to me exactly comparable to >complaining if a numeric datatype doesn't remember how many trailing >zeroes you typed after the decimal point. Those zeroes aren't >semantically significant, so you have no case.\ > > My only gripe here is that the implicit rtrimming is going to cause problems in cases where you are trying to do things with fixed-length fields, which is really where one is likely to use bpchar anyway. It is not a showstopper, but I can see why some people don't like it. But can't please everyone :-) Best Wishes, Chris Travers Metatron Technology Consulting
On 2005-10-21 20:23, Tom Lane wrote: > "Dean Gibson (DB Administrator)" <postgresql4@ultimeth.com> writes: > >> I remember that discussion, and I was for the change. However, upon >> doing some testing after reading the above, I wonder if the >> blank-stripping isn't too aggressive. I have a CHAR(6) field (say, >> named Z) that has "abc " in it. Suppose I want to append "x" to Z, >> with any leading (oops, trailing) spaces in Z PRESERVED. >> > > (You meant trailing spaces, I assume.) Why exactly would you want to do that? You decided by your choice of datatypethat the trailing spaces weren't significant. This gripe seems to me exactly comparable to complaining if a numericdatatype doesn't remember how many trailing zeroes you typed after the decimal point. Those zeroes aren't semanticallysignificant, so you have no case. > > regards, tom lane > It's one thing, as with floating point values, where the internal representation is identical (1.0 stored identically to 1.00), and another thing where the internal representation is different (a la strings storing 'abc ' differently from 'abc'. While programming languages may differ as to how they compare those two strings, every one that I'm familiar with does not strip trailing spaces when concatenating. It's not an issue with me, but the 8.0 concatenation does violate the "Principle of Least Astonishment". As does treating 'abc ' and 'abc' different during comparisons (which 8.0 fixed). So my tentative argument would be, strip trailing spaces on CHAR(n) comparisons but not on concatenation or assignment, because you lost information. You can always strip if you need to (as you do in other languages). As I said, this issue doesn't presently affect my databases, it's just an observation. -- Dean
On Sat, 2005-10-22 at 00:39, Chris Travers wrote: > Tom Lane wrote: > > >"Dean Gibson (DB Administrator)" <postgresql4@ultimeth.com> writes: > > > > > >>I remember that discussion, and I was for the change. However, upon > >>doing some testing after reading the above, I wonder if the > >>blank-stripping isn't too aggressive. I have a CHAR(6) field (say, > >>named Z) that has "abc " in it. Suppose I want to append "x" to Z, > >>with any leading spaces in Z PRESERVED. > >> > >> > > > >(You meant trailing spaces, I assume.) Why exactly would you want to do > >that? You decided by your choice of datatype that the trailing spaces > >weren't significant. > > > I once built a telecom billing app where this might be important (fixed > length fields). Lets say you have fixed length fields defined as > char(n) datatypes. You may want to build a query to generate billing > records like: > select field1 || field2 || field3 || field4 || field5 ... AS bill_record > FROM lec_billing_entries; > > It seels to me that I would expect trailing spaces to be preserved in > these cases. Having an implicit rtrim function is asking for problems. > Personally I would rather have to call rtrim explicitly than have the > backend treat the concatenation differently than if I do it on the client. If I rememberate correctificantly, this problem is or was caused by the || operator using the same internal logic for varchar / text AND char. Tom, is it reasonable / doable to have || have a different set of internal functions for each of those types.