Thread: Patch for database locale settings

Patch for database locale settings

From
Alexey Slynko
Date:
Hi,

this patch allow to use database locale settings. It remove cluster locale
settings, and append LCCTYPE and
LCCOLLATE items to  CREATE DATABASE syntax.

Any considerations ?

best regards,
                 Alexey Slynko

Attachment

Re: Patch for database locale settings

From
Alvaro Herrera
Date:
On Mon, Mar 28, 2005 at 12:16:42PM +0400, Alexey Slynko wrote:

> this patch allow to use database locale settings. It remove cluster locale
> settings, and append LCCTYPE and
> LCCOLLATE items to  CREATE DATABASE syntax.
>
> Any considerations ?

The problem with this is what happens to indexes on shared relations.
If you change the collation, they are no longer valid.  And you can't
just reindex them, because then they would no longer be valid for other
databases.

Maybe it would work if we forced indexes on shared relations to be
scanned using a fixed collation.

Not sure about the ctype part ...

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Los románticos son seres que mueren de deseos de vida"

Re: Patch for database locale settings

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> Maybe it would work if we forced indexes on shared relations to be
> scanned using a fixed collation.

The shared relations only have indexes on name, oid, and integer:
select distinct atttypid::regtype
from pg_class c join pg_attribute a on c.oid = a.attrelid
where relisshared and relkind = 'i';
and name has non-locale-sensitive ordering rules anyway.  So that's
not the big problem; we could probably get away with decreeing that
name will always be that way and that shared relations can't have
locale-dependent indexes.

The big problem (and the reason why this idea has been shot down in
the past) is that CREATE DATABASE can't change the locale from what it
is in the template database unless it's prepared to reindex any locale-
sensitive indexes in the non-shared part of the template database.
Which would be a difficult undertaking seeing that we can't even connect
to the copied database until after commit.

We could maybe say that we will never have any locale-dependent indexes
at all on any system catalog, but what of user-defined tables in
template databases?  It would simply not work to do something as simple
as creating a table with an indexed text column in template1.

On the other hand you could argue that people already run the same kind
of risk when changing database encoding at CREATE, which is a feature
that's been there a long time and hasn't garnered many complaints.
Not so much that their indexes will break as that their data will.
So perhaps we should be willing to document "don't do that".  Certainly
it would be a lot more useful if both locale and encoding could be set
per-database.

            regards, tom lane

Re: Patch for database locale settings

From
Alexey Slynko
Date:
  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--0-1348485674-1112025858=:22875
Content-Type: TEXT/PLAIN; charset=KOI8-R; format=flowed
Content-Transfer-Encoding: 8BIT

> On Mon, Mar 28, 2005 at 12:16:42PM +0400, Alexey Slynko wrote:
>
> > this patch allow to use database locale settings. It remove cluster locale
> > settings, and append LCCTYPE and
> > LCCOLLATE items to  CREATE DATABASE syntax.
> >
> > Any considerations ?
>
> The problem with this is what happens to indexes on shared relations.
> If you change the collation, they are no longer valid.  And you can't
> just reindex them, because then they would no longer be valid for other
> databases.
I havn't see any shared indices, that uses any variable collation, like text
indices. All shared indexed columns have integer and name type. As I understand,
all this types have fixed collation. Or I've missed something ?

>
> Maybe it would work if we forced indexes on shared relations to be
> scanned using a fixed collation.
>
> Not sure about the ctype part ...
>
> --
> Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
> "Los romАnticos son seres que mueren de deseos de vida"
>

                 Alexey Slynko
                 E-mail: slynko@tronet.ru

--0-1348485674-1112025858=:22875--

Re: Patch for database locale settings

From
Alexey Slynko
Date:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
>> Maybe it would work if we forced indexes on shared relations to be
>> scanned using a fixed collation.
>
> The shared relations only have indexes on name, oid, and integer:
> select distinct atttypid::regtype
> from pg_class c join pg_attribute a on c.oid = a.attrelid
> where relisshared and relkind = 'i';
> and name has non-locale-sensitive ordering rules anyway.  So that's
> not the big problem; we could probably get away with decreeing that
> name will always be that way and that shared relations can't have
> locale-dependent indexes.
>
> The big problem (and the reason why this idea has been shot down in
> the past) is that CREATE DATABASE can't change the locale from what it
> is in the template database unless it's prepared to reindex any locale-
> sensitive indexes in the non-shared part of the template database.
> Which would be a difficult undertaking seeing that we can't even connect
> to the copied database until after commit.

> We could maybe say that we will never have any locale-dependent indexes
> at all on any system catalog, but what of user-defined tables in
> template databases?  It would simply not work to do something as simple
> as creating a table with an indexed text column in template1.
There is another way to broke indexes, like specify another user-defined
template database in CREATE DATABASE.

I think, that we can add new parameter, like LOCALEDEPEND, in CREATE TYPE
syntax. It allows to separate locale-dependent indexes and reindex them. But I
havn't yet any idea, how we can reindex database immediately after creation. Any
suggestions ?

>
> On the other hand you could argue that people already run the same kind
> of risk when changing database encoding at CREATE, which is a feature
> that's been there a long time and hasn't garnered many complaints.
> Not so much that their indexes will break as that their data will.
> So perhaps we should be willing to document "don't do that".  Certainly
> it would be a lot more useful if both locale and encoding could be set
> per-database.
>
>             regards, tom lane
>
>

                 Alexey Slynko
                 E-mail: slynko@tronet.ru


Re: Patch for database locale settings

From
Alvaro Herrera
Date:
On Mon, Mar 28, 2005 at 10:54:16AM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > Maybe it would work if we forced indexes on shared relations to be
> > scanned using a fixed collation.
>
> The shared relations only have indexes on name, oid, and integer:
> select distinct atttypid::regtype
> from pg_class c join pg_attribute a on c.oid = a.attrelid
> where relisshared and relkind = 'i';
> and name has non-locale-sensitive ordering rules anyway.  So that's
> not the big problem; we could probably get away with decreeing that
> name will always be that way and that shared relations can't have
> locale-dependent indexes.

This is good news.

> The big problem (and the reason why this idea has been shot down in
> the past) is that CREATE DATABASE can't change the locale from what it
> is in the template database unless it's prepared to reindex any locale-
> sensitive indexes in the non-shared part of the template database.
> Which would be a difficult undertaking seeing that we can't even connect
> to the copied database until after commit.

I don't see how this is a showstopper.  At creation time we may decree
that the database is "incomplete", and users can't normally connect to
it; we only allow that after a phase of correcting minor issues, such as
reindexing if necessary.  I recall Fabien Coelho wanted to do something
on this area too.

We could do this with a new column (say) datready in pg_database, set to
false by CreateDb(); at connection time this could be checked.  The
downside I see to this is that the bit has to be checked at each
connection start; not sure if this is exceedingly onerous.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Licensee shall have no right to use the Licensed Software
for productive or commercial use. (Licencia de StarOffice 6.0 beta)

Re: Patch for database locale settings

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> I don't see how this is a showstopper.  At creation time we may decree
> that the database is "incomplete", and users can't normally connect to
> it; we only allow that after a phase of correcting minor issues, such as
> reindexing if necessary.  I recall Fabien Coelho wanted to do something
> on this area too.

I didn't like his patch then, and I don't like it now ;-).  It strikes
me as a mighty dirty solution to the problem.

            regards, tom lane

Re: [HACKERS] Patch for database locale settings

From
Bruce Momjian
Date:
Description added to TODO:

    * Allow locale to be set at database creation

      Currently locale can only be set during initdb.  No global tables have
      locale-aware columns.  However, the database template used during
      database creation might have locale-aware indexes.  The indexes would
      need to be reindexed to match the new locale.

---------------------------------------------------------------------------

Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > Maybe it would work if we forced indexes on shared relations to be
> > scanned using a fixed collation.
>
> The shared relations only have indexes on name, oid, and integer:
> select distinct atttypid::regtype
> from pg_class c join pg_attribute a on c.oid = a.attrelid
> where relisshared and relkind = 'i';
> and name has non-locale-sensitive ordering rules anyway.  So that's
> not the big problem; we could probably get away with decreeing that
> name will always be that way and that shared relations can't have
> locale-dependent indexes.
>
> The big problem (and the reason why this idea has been shot down in
> the past) is that CREATE DATABASE can't change the locale from what it
> is in the template database unless it's prepared to reindex any locale-
> sensitive indexes in the non-shared part of the template database.
> Which would be a difficult undertaking seeing that we can't even connect
> to the copied database until after commit.
>
> We could maybe say that we will never have any locale-dependent indexes
> at all on any system catalog, but what of user-defined tables in
> template databases?  It would simply not work to do something as simple
> as creating a table with an indexed text column in template1.
>
> On the other hand you could argue that people already run the same kind
> of risk when changing database encoding at CREATE, which is a feature
> that's been there a long time and hasn't garnered many complaints.
> Not so much that their indexes will break as that their data will.
> So perhaps we should be willing to document "don't do that".  Certainly
> it would be a lot more useful if both locale and encoding could be set
> per-database.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073