Refined LC_COLLATE or multiple database clusters? - Mailing list pgsql-admin

From Grega Bremec
Subject Refined LC_COLLATE or multiple database clusters?
Date
Msg-id 20040609103303.GA4016@elbereth.noviforum.si
Whole thread Raw
Responses Re: Refined LC_COLLATE or multiple database clusters?
Re: [HACKERS] Refined LC_COLLATE or multiple database clusters?
List pgsql-admin
Hello, List,

I recently stumbled across a problem that I can't really get across.

We have a database cluster (PG 7.4.2) that was initialized as follows:

    $ pg_controldata /data/dir
    pg_control version number:            72
    Catalog version number:               200310211
    Database cluster state:               in production
    pg_control last modified:             sre 09 jun 2004 03:00:26 CEST
    Current log file ID:                  20
    Next log file segment:                63
    ... <snip irrelevant checkpoint info> ...
    Database block size:                  8192
    Blocks per segment of large relation: 131072
    Maximum length of identifiers:        64
    Maximum number of function arguments: 32
    Date/time type storage:               64-bit integers
    Maximum length of locale name:        128
    LC_COLLATE:                           C
    LC_CTYPE:                             C

Inside that cluster, there are several databases using different encodings:

    template1=# \l
         List of databases
    Name     |  Owner   | Encoding
    -------------+----------+----------
     db1         | ownera   | UNICODE
     db2         | ownera   | UNICODE
     db3         | ownerb   | LATIN2
     db4         | ownerc   | LATIN2
     db5         | ownera   | LATIN2
     db6         | ownera   | LATIN2
     template0   | postgres | UNICODE
     template1   | postgres | UNICODE

Collate order for those databases, however, needs to be different. Obviously,
db3, db4, db5 and db6 will want to use some collate ordering scheme based on
ISO-8859-2, whereas the other two could possibly have table- or even column-
based collate requirements, as they contain unicode data in UTF-8 encoding,
which doesn't give any warranties wrt the contents of these databases.

Producing a list of requirements and/or imposing certain conventions on the
format of data stored in those tables is outside the scope of my authorities,
the only reasonable assumption I can make is that these databases could be
limited to one collating order per database (or several databases, as it is)
without much hassle.

Also, running several postmasters on this same machine is not an option, as
it only has 1.5GB RAM, of which only 1GB is available for cache (~260MB is
swapped after roughly a month's uptime, but that doesn't change much after
settling down in a week or two).

My question to the list would be the following:

Is it possible to do either of these things that could solve this problem
adequately:

    - somehow manage to make one postmaster run on top of two separate
      database clusters that would each have a different collate ordering
      scheme

    - use some other method of initializing one database from a different
      template and taking with it LC_COLLATE setting (I suppose not, as
      the "${PGDATA}/global/" directory is global to the cluster)

    - use a patch that would add such functionality or upgrade to a version
      (even if release-candidate, beta is not really an option, i gather)
      of PostgreSQL that supported it

    - in absence of any other viable solution, change the global setting of
      the database cluster without having to dump/reinitdb/restore it

I thank you in advance for your valuable input,
--
    Grega Bremec
    Senior Administrator
    Noviforum Ltd., Software & Media
    http://www.noviforum.si/

pgsql-admin by date:

Previous
From: "lise chhay"
Date:
Subject: Is the server running locally and accepting connections on Unix domain socket /tmp/.s.PGSQL.5432 ?
Next
From: "Scott Marlowe"
Date:
Subject: Re: [Fwd: Re: RHDB just sits and does nothing?]