Thread: Patch for database locale settings
Hi, this patch allow to use database locale settings. It remove cluster locale settings, and append LCCTYPE and LCCOLLATE items to CREATE DATABASE syntax. Any considerations ? best regards, Alexey Slynko
Attachment
On Mon, Mar 28, 2005 at 12:16:42PM +0400, Alexey Slynko wrote: > this patch allow to use database locale settings. It remove cluster locale > settings, and append LCCTYPE and > LCCOLLATE items to CREATE DATABASE syntax. > > Any considerations ? The problem with this is what happens to indexes on shared relations. If you change the collation, they are no longer valid. And you can't just reindex them, because then they would no longer be valid for other databases. Maybe it would work if we forced indexes on shared relations to be scanned using a fixed collation. Not sure about the ctype part ... -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) "Los románticos son seres que mueren de deseos de vida"
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > Maybe it would work if we forced indexes on shared relations to be > scanned using a fixed collation. The shared relations only have indexes on name, oid, and integer: select distinct atttypid::regtype from pg_class c join pg_attribute a on c.oid = a.attrelid where relisshared and relkind = 'i'; and name has non-locale-sensitive ordering rules anyway. So that's not the big problem; we could probably get away with decreeing that name will always be that way and that shared relations can't have locale-dependent indexes. The big problem (and the reason why this idea has been shot down in the past) is that CREATE DATABASE can't change the locale from what it is in the template database unless it's prepared to reindex any locale- sensitive indexes in the non-shared part of the template database. Which would be a difficult undertaking seeing that we can't even connect to the copied database until after commit. We could maybe say that we will never have any locale-dependent indexes at all on any system catalog, but what of user-defined tables in template databases? It would simply not work to do something as simple as creating a table with an indexed text column in template1. On the other hand you could argue that people already run the same kind of risk when changing database encoding at CREATE, which is a feature that's been there a long time and hasn't garnered many complaints. Not so much that their indexes will break as that their data will. So perhaps we should be willing to document "don't do that". Certainly it would be a lot more useful if both locale and encoding could be set per-database. regards, tom lane
This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1348485674-1112025858=:22875 Content-Type: TEXT/PLAIN; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 8BIT > On Mon, Mar 28, 2005 at 12:16:42PM +0400, Alexey Slynko wrote: > > > this patch allow to use database locale settings. It remove cluster locale > > settings, and append LCCTYPE and > > LCCOLLATE items to CREATE DATABASE syntax. > > > > Any considerations ? > > The problem with this is what happens to indexes on shared relations. > If you change the collation, they are no longer valid. And you can't > just reindex them, because then they would no longer be valid for other > databases. I havn't see any shared indices, that uses any variable collation, like text indices. All shared indexed columns have integer and name type. As I understand, all this types have fixed collation. Or I've missed something ? > > Maybe it would work if we forced indexes on shared relations to be > scanned using a fixed collation. > > Not sure about the ctype part ... > > -- > Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) > "Los romАnticos son seres que mueren de deseos de vida" > Alexey Slynko E-mail: slynko@tronet.ru --0-1348485674-1112025858=:22875--
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes: >> Maybe it would work if we forced indexes on shared relations to be >> scanned using a fixed collation. > > The shared relations only have indexes on name, oid, and integer: > select distinct atttypid::regtype > from pg_class c join pg_attribute a on c.oid = a.attrelid > where relisshared and relkind = 'i'; > and name has non-locale-sensitive ordering rules anyway. So that's > not the big problem; we could probably get away with decreeing that > name will always be that way and that shared relations can't have > locale-dependent indexes. > > The big problem (and the reason why this idea has been shot down in > the past) is that CREATE DATABASE can't change the locale from what it > is in the template database unless it's prepared to reindex any locale- > sensitive indexes in the non-shared part of the template database. > Which would be a difficult undertaking seeing that we can't even connect > to the copied database until after commit. > We could maybe say that we will never have any locale-dependent indexes > at all on any system catalog, but what of user-defined tables in > template databases? It would simply not work to do something as simple > as creating a table with an indexed text column in template1. There is another way to broke indexes, like specify another user-defined template database in CREATE DATABASE. I think, that we can add new parameter, like LOCALEDEPEND, in CREATE TYPE syntax. It allows to separate locale-dependent indexes and reindex them. But I havn't yet any idea, how we can reindex database immediately after creation. Any suggestions ? > > On the other hand you could argue that people already run the same kind > of risk when changing database encoding at CREATE, which is a feature > that's been there a long time and hasn't garnered many complaints. > Not so much that their indexes will break as that their data will. > So perhaps we should be willing to document "don't do that". Certainly > it would be a lot more useful if both locale and encoding could be set > per-database. > > regards, tom lane > > Alexey Slynko E-mail: slynko@tronet.ru
On Mon, Mar 28, 2005 at 10:54:16AM -0500, Tom Lane wrote: > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > Maybe it would work if we forced indexes on shared relations to be > > scanned using a fixed collation. > > The shared relations only have indexes on name, oid, and integer: > select distinct atttypid::regtype > from pg_class c join pg_attribute a on c.oid = a.attrelid > where relisshared and relkind = 'i'; > and name has non-locale-sensitive ordering rules anyway. So that's > not the big problem; we could probably get away with decreeing that > name will always be that way and that shared relations can't have > locale-dependent indexes. This is good news. > The big problem (and the reason why this idea has been shot down in > the past) is that CREATE DATABASE can't change the locale from what it > is in the template database unless it's prepared to reindex any locale- > sensitive indexes in the non-shared part of the template database. > Which would be a difficult undertaking seeing that we can't even connect > to the copied database until after commit. I don't see how this is a showstopper. At creation time we may decree that the database is "incomplete", and users can't normally connect to it; we only allow that after a phase of correcting minor issues, such as reindexing if necessary. I recall Fabien Coelho wanted to do something on this area too. We could do this with a new column (say) datready in pg_database, set to false by CreateDb(); at connection time this could be checked. The downside I see to this is that the bit has to be checked at each connection start; not sure if this is exceedingly onerous. -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) Licensee shall have no right to use the Licensed Software for productive or commercial use. (Licencia de StarOffice 6.0 beta)
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > I don't see how this is a showstopper. At creation time we may decree > that the database is "incomplete", and users can't normally connect to > it; we only allow that after a phase of correcting minor issues, such as > reindexing if necessary. I recall Fabien Coelho wanted to do something > on this area too. I didn't like his patch then, and I don't like it now ;-). It strikes me as a mighty dirty solution to the problem. regards, tom lane
Description added to TODO: * Allow locale to be set at database creation Currently locale can only be set during initdb. No global tables have locale-aware columns. However, the database template used during database creation might have locale-aware indexes. The indexes would need to be reindexed to match the new locale. --------------------------------------------------------------------------- Tom Lane wrote: > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > Maybe it would work if we forced indexes on shared relations to be > > scanned using a fixed collation. > > The shared relations only have indexes on name, oid, and integer: > select distinct atttypid::regtype > from pg_class c join pg_attribute a on c.oid = a.attrelid > where relisshared and relkind = 'i'; > and name has non-locale-sensitive ordering rules anyway. So that's > not the big problem; we could probably get away with decreeing that > name will always be that way and that shared relations can't have > locale-dependent indexes. > > The big problem (and the reason why this idea has been shot down in > the past) is that CREATE DATABASE can't change the locale from what it > is in the template database unless it's prepared to reindex any locale- > sensitive indexes in the non-shared part of the template database. > Which would be a difficult undertaking seeing that we can't even connect > to the copied database until after commit. > > We could maybe say that we will never have any locale-dependent indexes > at all on any system catalog, but what of user-defined tables in > template databases? It would simply not work to do something as simple > as creating a table with an indexed text column in template1. > > On the other hand you could argue that people already run the same kind > of risk when changing database encoding at CREATE, which is a feature > that's been there a long time and hasn't garnered many complaints. > Not so much that their indexes will break as that their data will. > So perhaps we should be willing to document "don't do that". Certainly > it would be a lot more useful if both locale and encoding could be set > per-database. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073