Thread: Show encoding in initdb messages

Show encoding in initdb messages

From
Christopher Kings-Lynne
Date:
Does this:

The files belonging to this database system will be owned by user "chriskl".
This user must also own the server process.

The database cluster will be initialized with locale C.

The database cluster will be initialized with default encoding UNICODE.

creating directory /home/chriskl/local/data ... ok
creating directory /home/chriskl/local/data/global ... ok
....

This should save a lot of support requests, hopefully.

Chris


Re: Show encoding in initdb messages

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> The database cluster will be initialized with locale C.
> The database cluster will be initialized with default encoding UNICODE.

> This should save a lot of support requests, hopefully.

I kinda doubt it will save any :-(.  In what situation would this not
merely be echoing back what the guy had just specifically typed on the
command line?

If we knew how to detect that the encoding doesn't work with the
selected locale value, then we could get somewhere ...

            regards, tom lane

Re: Show encoding in initdb messages

From
Christopher Kings-Lynne
Date:
>> This should save a lot of support requests, hopefully.
>
> I kinda doubt it will save any :-(.  In what situation would this not
> merely be echoing back what the guy had just specifically typed on the
> command line?

When no -E argument is supplied at all, or when they type ISO-8859-1
instead of LATIN1.

The reason it will help with support is because newbies will go
"SQL_ASCII! I don't want ascii!".  Then they will kill initdb, check the
man page and then re-run it with a real encoding.

Seriously, NO new postgres user checks the initdb man page before running
it the first time.  I know this from the irc channel...

Either way, I see no reason _not_ to just do it...

Chris


Re: Show encoding in initdb messages

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>>> This should save a lot of support requests, hopefully.
>>
>> I kinda doubt it will save any :-(.  In what situation would this not
>> merely be echoing back what the guy had just specifically typed on the
>> command line?

> When no -E argument is supplied at all, or when they type ISO-8859-1
> instead of LATIN1.

When no -E is supplied, we always default to SQL_ASCII; there's no
possibility of adopting a value from the environment.  The reason that
the locale printout exists is that the command line doesn't completely
specify what locale will be used.  That reason doesn't apply to encoding.

> The reason it will help with support is because newbies will go
> "SQL_ASCII! I don't want ascii!".

No they won't.  They will likely not even notice this message in the sea
of other messages they've never seen before; and even if they do notice
it, they will certainly not realize that they don't want it.  If the
message were to *say* "this is probably a bad choice because it's
incompatible with your locale selection", then it might possibly have
the effect you're hoping for, but I don't see how we can find that out.

> Either way, I see no reason _not_ to just do it...

We could make initdb print out every other setting it has too, but that
would not improve its user interface.  Adding messages that don't carry
useful content just debases the importance of each one.

            regards, tom lane

Re: Show encoding in initdb messages

From
"Magnus Hagander"
Date:
> > The reason it will help with support is because newbies will go
> > "SQL_ASCII! I don't want ascii!".
>
> No they won't.  They will likely not even notice this message
> in the sea of other messages they've never seen before; and
> even if they do notice it, they will certainly not realize
> that they don't want it.  If the message were to *say* "this
> is probably a bad choice because it's incompatible with your
> locale selection", then it might possibly have the effect
> you're hoping for, but I don't see how we can find that out.

Another way would be to make -E a mandatory parameter, and remove the
default completely. That way you have to make a conscious decision.
Can't claim surprise then.


//Magnus

Re: Show encoding in initdb messages

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
>>> The reason it will help with support is because newbies will go
>>> "SQL_ASCII! I don't want ascii!".
>>
>> No they won't.  They will likely not even notice this message
>> in the sea of other messages they've never seen before; and
>> even if they do notice it, they will certainly not realize
>> that they don't want it.

> Another way would be to make -E a mandatory parameter, and remove the
> default completely. That way you have to make a conscious decision.
> Can't claim surprise then.

I don't think surprise is the problem; I think the problem is knowing
what setting will produce the result you want.  Newbies are
fundamentally unlikely to have this knowledge :-(

What I personally wish we could do is eliminate database encoding as
a separate setting altogether, and drive it off the locale selection.
I don't know how to do that though.

            regards, tom lane

Re: Show encoding in initdb messages

From
Peter Eisentraut
Date:
Tom Lane wrote:
> What I personally wish we could do is eliminate database encoding as
> a separate setting altogether, and drive it off the locale selection.
> I don't know how to do that though.

The information is available:

$ LANG=de_DE locale charmap
ISO-8859-1
$ LANG=de_DE@euro locale charmap
ISO-8859-15
$ LANG=de_DE.utf8 locale charmap
UTF-8

But the answer space is infinite:

$ LANG=C locale charmap
ANSI_X3.4-1968

I suspect Japanese users will also have a problem with this mechanism,
but at least we could keep -E to override the automatic selection.
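
The same lookup can be made programmatically rather than by running the
locale command. The following is a minimal Python sketch, assuming a
Unix-like platform where nl_langinfo(CODESET) is available; the function
name codeset_for_locale is hypothetical and is not part of initdb. The
strings it returns are platform-specific, just like the "locale charmap"
output above.

```python
import locale
from typing import Optional


def codeset_for_locale(name: str) -> Optional[str]:
    """Return the codeset the C library reports for `name`, or None.

    None means the requested locale could not be selected, for example
    because it is not installed on this machine.
    """
    # Calling setlocale() with no locale argument only queries the
    # current setting, which is saved so it can be restored afterwards.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        # Roughly the programmatic counterpart of `locale charmap`.
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    for name in ("C", "de_DE", "de_DE@euro", "de_DE.utf8"):
        print(name, "->", codeset_for_locale(name))
```

Locales that are not installed simply print None here, so the output
depends on the machine the sketch is run on.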


Re: Show encoding in initdb messages

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> But the answer space is infinite:

> $ LANG=C locale charmap
> ANSI_X3.4-1968

Right, the hard part is mapping whatever weird string "locale charmap"
chooses to return into one of the encodings our code knows about.

HPUX seems to be just arbitrarily bizarre:

$ LC_ALL=en_US.iso88591 locale charmap
"iso88591.cm"
$ LC_ALL=C.utf8 locale charmap
"utf8.cm"
$

As near as I can tell, the .cm is present in all the possible results;
and why the quotes?

> I suspect Japanese users will also have a problem with this mechanism,
> but at least we could keep -E to override the automatic selection.

Perhaps we could try to derive a setting from locale charmap, but barf
and require explicit -E if we can't recognize it?

            regards, tom lane
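
The "recognize it or require -E" idea could look roughly like the Python
sketch below. It is only an illustration: the lookup table, the function
names, and the exception are hypothetical, the table lists only the
spellings that appear in this thread, and the target names (LATIN1,
LATIN9, UNICODE, SQL_ASCII) are assumed to be the encoding names the
server accepts. Normalization strips the quotes and the ".cm" suffix seen
on HPUX and ignores case and punctuation.

```python
import re

# Hypothetical table: normalized charmap spelling -> database encoding name.
# Illustrative only; a real table would need many more entries.
KNOWN_ENCODINGS = {
    "iso88591": "LATIN1",
    "iso885915": "LATIN9",
    "utf8": "UNICODE",
    "ansix341968": "SQL_ASCII",
}


class UnknownCharmap(Exception):
    """Raised when a charmap string cannot be mapped to a known encoding."""


def normalize_charmap(raw: str) -> str:
    """Reduce a `locale charmap` result to lowercase letters and digits."""
    s = raw.strip().strip('"')          # HPUX wraps the value in quotes
    if s.lower().endswith(".cm"):       # ... and appends ".cm"
        s = s[:-3]
    return re.sub(r"[^a-z0-9]", "", s.lower())


def encoding_from_charmap(raw: str) -> str:
    """Map a charmap string to an encoding name, or fail loudly."""
    key = normalize_charmap(raw)
    try:
        return KNOWN_ENCODINGS[key]
    except KeyError:
        raise UnknownCharmap(
            f"could not determine an encoding for charmap {raw!r}; "
            "specify one explicitly with -E"
        ) from None


if __name__ == "__main__":
    for raw in ("ISO-8859-1", '"iso88591.cm"', "UTF-8", "ANSI_X3.4-1968"):
        print(raw, "->", encoding_from_charmap(raw))
```

Failing on an unrecognized string, instead of guessing, keeps -E as the
explicit override for platforms or locales the table does not cover.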

Re: Show encoding in initdb messages

From
Andrew Dunstan
Date:
Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
>> But the answer space is infinite:
>>
>> $ LANG=C locale charmap
>> ANSI_X3.4-1968
>
> Right, the hard part is mapping whatever weird string "locale charmap"
> chooses to return into one of the encodings our code knows about.
>
> HPUX seems to be just arbitrarily bizarre:
>
> $ LC_ALL=en_US.iso88591 locale charmap
> "iso88591.cm"
> $ LC_ALL=C.utf8 locale charmap
> "utf8.cm"
> $
>
> As near as I can tell, the .cm is present in all the possible results;
> and why the quotes?
>
>> I suspect Japanese users will also have a problem with this mechanism,
>> but at least we could keep -E to override the automatic selection.
>
> Perhaps we could try to derive a setting from locale charmap, but barf
> and require explicit -E if we can't recognize it?

Sounds like an excellent plan, at least for platforms that have such a
command. Windows does not appear to :-(. Are there other *nixes that
also lack it, or that also produce bizarre results like HP-UX?

cheers

andrew

Re: Show encoding in initdb messages

From
Bruce Momjian
Date:
Is this a TODO?

---------------------------------------------------------------------------

Andrew Dunstan wrote:
> Tom Lane wrote:
>
>> Peter Eisentraut <peter_e@gmx.net> writes:
>>> But the answer space is infinite:
>>>
>>> $ LANG=C locale charmap
>>> ANSI_X3.4-1968
>>
>> Right, the hard part is mapping whatever weird string "locale charmap"
>> chooses to return into one of the encodings our code knows about.
>>
>> HPUX seems to be just arbitrarily bizarre:
>>
>> $ LC_ALL=en_US.iso88591 locale charmap
>> "iso88591.cm"
>> $ LC_ALL=C.utf8 locale charmap
>> "utf8.cm"
>> $
>>
>> As near as I can tell, the .cm is present in all the possible results;
>> and why the quotes?
>>
>>> I suspect Japanese users will also have a problem with this mechanism,
>>> but at least we could keep -E to override the automatic selection.
>>
>> Perhaps we could try to derive a setting from locale charmap, but barf
>> and require explicit -E if we can't recognize it?
>
> Sounds like an excellent plan, at least for platforms that have such a
> command. Windows does not appear to :-(. Are there other *nixes that
> also lack it, or that also produce bizarre results like HP-UX?
>
> cheers
>
> andrew
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073