Thread: Case preserving - suggestions
Hi list, A postgresql migration I am doing (the same one for which the OLE DB driver was written) has finally passed the proof-of-concept stage (phew). I now have lots and lots of tidbits, tricks and tips for SQL Server migration, which I would love to put online. Is pgFoundry the right place? I understand that the code snippets section is not yet operative, but I would still love to put it online ASAP (i.e. - before I forget), and to have it all in one place. One problem detected during that stage, however, was that the program pretty much relies on the collation being case insensitive. I am now trying to gather the info regarding adding case preserving to Postgresql. I already suggested that we do that by changing the procedures, and the idea was turned down. For example, a column UNIQUE constraint must enforce that only one instance of a string be present, case insensitive. Then again, making everything lower/upper case before putting it in was also rejected. Case preserving is what we are looking for. Now, one idea that floated through my mind, and I have not yet looked into how difficult it would be to implement was to define a new system wide collation, called, for example, en_USCI. Have that collation define 'a' and 'A' as "the same character". I'm looking for someone with more experience with these things than me (i.e. - just about anyone) to say whether such a thing is doable. I know I can reorder sort criteria using collation, but can I make two characters be actually the same? As a side note, I'll mention that MsSQL uses the collation field to define case insensitivity. Assuming that fails, how hard would it be to create a case insensitive PostgreSQL? Would that be more like changing a couple of places (say, hash computation and string compares), or would that entail making hundreds of little changes all over the code? Is there anything in the regression testing infrastructure that can help check such a change? Many thanks, Shachar -- Shachar Shemesh Lingnu Open Source Consulting http://www.lingnu.com/
Shachar Shemesh <psql@shemesh.biz> writes: > One problem detected during that stage, however, was that the program > pretty much relies on the collation being case insensitive. I am now > trying to gather the info regarding adding case preserving to > Postgresql. I already suggested that we do that by changing the > procedures, and the idea was turned down. For example, a column UNIQUE > constraint must enforce that only one instance of a string be present, > case insensitive. Is there a reason you can't just docreate unique index on mytable (upper(mycolumn)) ? > Now, one idea that floated through my mind, and I have not yet looked > into how difficult it would be to implement was to define a new system > wide collation, called, for example, en_USCI. I don't know anything about the difficulty of this either. I suspect that it would be a bad idea to create a collation in which strcoll could claim that nonidentical strings are equal --- you would likely find that surprising things break. You could try it though and see what happens. > Assuming that fails, how hard would it be to create a case insensitive > PostgreSQL? It would be a serious mistake to try to make a "case insensitive PostgreSQL", if by that you mean hacking the system structure to enforce case insensitivity globally. You could reasonably think about inventing a case-insensitive textual datatype, however. (In fact, I'd not be surprised if someone did that already.) All you'd really need is to write a set of case-insensitive comparison operators and define a btree opclass with them. Everything else you might want to do could be handled by making the type binary-equivalent to text and piggybacking on the existing functions for text. regards, tom lane
On Sunday 06 June 2004 13:47, Shachar Shemesh wrote: > Hi list, > > A postgresql migration I am doing (the same one for which the OLE DB > driver was written) has finally passed the proof-of-concept stage > (phew). I now have lots and lots of tidbits, tricks and tips for SQL > Server migration, which I would love to put online. Is pgFoundry the > right place? I understand that the code snippets section is not yet > operative, but I would still love to put it online ASAP (i.e. - before I > forget), and to have it all in one place. > We have a couple of articles about sql server migration up on techdocs now, so that seems the most appropriate place for documenting the process you went through. Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Robert Treat wrote: >On Sunday 06 June 2004 13:47, Shachar Shemesh wrote: > > >>Hi list, >> >>A postgresql migration I am doing (the same one for which the OLE DB >>driver was written) has finally passed the proof-of-concept stage >>(phew). I now have lots and lots of tidbits, tricks and tips for SQL >>Server migration, which I would love to put online. Is pgFoundry the >>right place? I understand that the code snippets section is not yet >>operative, but I would still love to put it online ASAP (i.e. - before I >>forget), and to have it all in one place. >> >> >> > >We have a couple of articles about sql server migration up on techdocs now, so >that seems the most appropriate place for documenting the process you went >through. > >Robert Treat > > We wrote a new data type (a tinyint replacement), that will not be part of postgresql any time soon. We need to put the code up somewhere. We wrote a perl script that takes the output of the database dump done by MS-SQL, and converts it into an SQL script for postgres (including yanking the data from the MS-SQL tables, and inlining it into the Postgresql script using copy). That needs to go somewhere. It is highly likely that people will find bugs, or want to make additions, to the above two, so source control seems like a necessary idea. In short, I think a pgFoundry project seems better suited for our needs than static documentation. Shachar -- Shachar Shemesh Lingnu Open Source Consulting http://www.lingnu.com/
Shachar Shemesh wrote: > Robert Treat wrote: > >> On Sunday 06 June 2004 13:47, Shachar Shemesh wrote: >> >> >>> Hi list, >>> >>> A postgresql migration I am doing (the same one for which the OLE DB >>> driver was written) has finally passed the proof-of-concept stage >>> (phew). I now have lots and lots of tidbits, tricks and tips for SQL >>> Server migration, which I would love to put online. Is pgFoundry the >>> right place? I understand that the code snippets section is not yet >>> operative, but I would still love to put it online ASAP (i.e. - >>> before I >>> forget), and to have it all in one place. >>> >>> >> >> >> We have a couple of articles about sql server migration up on >> techdocs now, so that seems the most appropriate place for >> documenting the process you went through. >> Robert Treat >> >> > We wrote a new data type (a tinyint replacement), that will not be > part of postgresql any time soon. We need to put the code up somewhere. > We wrote a perl script that takes the output of the database dump done > by MS-SQL, and converts it into an SQL script for postgres (including > yanking the data from the MS-SQL tables, and inlining it into the > Postgresql script using copy). That needs to go somewhere. > It is highly likely that people will find bugs, or want to make > additions, to the above two, so source control seems like a necessary > idea. > In short, I think a pgFoundry project seems better suited for our > needs than static documentation. > > Shachar > Feel free to apply for a pgFoundry project - most are approved very quickly. cheers andrew