Protocol TODO: Identify server charset in handshake - Mailing list pgsql-hackers

From Craig Ringer
Subject Protocol TODO: Identify server charset in handshake
Msg-id 541A5659.5050006@2ndquadrant.com
List pgsql-hackers
Hi all

In the wire protocol, if you currently get an error from the server
before you know it has processed your startup packet successfully, you
don't know what character encoding that error is in.

If the error came from the postmaster then it's in the system's default
encoding or whatever locale the postmaster was started under.

If the error came from the DB backend it's in the DB backend's default
encoding, switched to during backend startup. Assuming it got that far.

If the error came after client_encoding was applied, it's in your
requested client_encoding.
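
To make the three cases above concrete, here is a minimal Python sketch of how a client sees an ErrorResponse in the current (v3) protocol: a type byte 'E', an Int32 length, then a series of one-byte field codes each followed by a NUL-terminated string. Nothing in the message names the encoding of those strings. The helper build_error_response and the sample field values are hypothetical, made up only to exercise the parser.

```python
import struct


def parse_error_response(msg: bytes) -> dict:
    """Split a v3 ErrorResponse into {field code: raw bytes}.

    Field values stay as bytes on purpose: the message itself does not
    say which encoding they are in.
    """
    if msg[:1] != b"E":
        raise ValueError("not an ErrorResponse")
    # The Int32 length counts itself but not the leading type byte.
    (length,) = struct.unpack("!i", msg[1:5])
    body = msg[5:1 + length]
    fields = {}
    pos = 0
    # Fields repeat until a lone zero byte terminates the list.
    while pos < len(body) and body[pos] != 0:
        code = chr(body[pos])
        end = body.index(b"\x00", pos + 1)
        fields[code] = body[pos + 1:end]
        pos = end + 1
    return fields


def build_error_response(fields: dict) -> bytes:
    """Hypothetical test helper: serialise fields the way a server would."""
    body = b"".join(
        code.encode("ascii") + value + b"\x00" for code, value in fields.items()
    ) + b"\x00"
    return b"E" + struct.pack("!i", len(body) + 4) + body


# Sample message with made-up content: severity and text in ISO-8859-5.
sample = build_error_response({
    "S": "ВАЖНО".encode("iso8859_5"),
    "C": b"3D000",
    "M": 'база данных "nosuch" не существует'.encode("iso8859_5"),
})
parsed = parse_error_response(sample)
print(parsed["C"])   # SQLSTATE is plain ASCII, so it is always readable
print(parsed["M"])   # raw bytes; the encoding has to be guessed
```

The one field a client can always rely on here is 'C', the SQLSTATE, since it is ASCII in every server encoding.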

This leaves the client unable to reliably interpret error messages. The
4.1 protocol should probably explicitly signal the encoding in the first
message from the server, and thereafter whenever it changes.
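
Until the protocol signals the encoding, about the best a client can do is guess. The following Python sketch shows one such heuristic, assuming the caller has already mapped PostgreSQL encoding names to Python codec names; decode_server_text and its parameters are hypothetical names for illustration, not part of libpq or any driver.

```python
from typing import Iterable, Optional, Tuple


def decode_server_text(
    raw: bytes,
    client_encoding: Optional[str],
    fallbacks: Iterable[str] = ("utf-8",),
) -> Tuple[str, Optional[str]]:
    """Best-effort decode of an error string of unknown encoding.

    Returns (text, encoding_used). encoding_used is None when every
    candidate failed and non-ASCII bytes were replaced with U+FFFD.
    """
    candidates = []
    if client_encoding:
        # Only correct if the error was raised after client_encoding applied.
        candidates.append(client_encoding)
    candidates.extend(fallbacks)
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown codec name: try the next one
    return raw.decode("ascii", errors="replace"), None


text, used = decode_server_text(
    "ВАЖНО".encode("iso8859_5"), "utf-8", fallbacks=["iso8859_5"]
)
print(text, used)
```

This is only a guess: single-byte encodings accept almost any byte sequence, so a "successful" decode with a fallback does not prove the guess was right. That is why an explicit signal from the server would be better.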

(This is somewhat related to the mess we make of text encodings in the
log files, where the postmaster writes in one encoding and DB backends
write in another).


Example psql session, in a terminal with en_AU.UTF-8 locale, connecting
to a postmaster started with:



$ LC_ALL=ru_RU.ISO-8859-5 LANG=ru_RU.ISO-8859-5 PATH=$HOME/pg/pg94/bin \
    postgres -D pg94_ru -p 9595

$ locale
LANG=en_AU.UTF-8
LC_CTYPE="en_AU.UTF-8"
...
LC_ALL=

$ psql -p 9595
psql: �����:  ���� ������ "craig" �� ����������

$ psql -q -p 9595 postgres
postgres=# \c nosuch
�����:  ���� ������ "nosuch" �� ����������
Previous connection kept
postgres=# select indb_error();
ОШИБКА:  функция indb_error() не существует
LINE 1: select indb_error();
               ^
HINT:  Функция с данными именем и типами аргументов не найдена.
Возможно, вам следует добавить явные преобразования типов.



Note the garbage where psql happily dumps an ISO-8859-5 message to the
terminal because it has no way of knowing it isn't in the current
client_encoding, and no way of telling what encoding it is anyway.
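
The garbage can be reproduced outside psql. In this short Python sketch the Russian message text is an assumption: it is a guess at what the postmaster sent, inferred from the word lengths in the garbled output above. Encoding it as ISO-8859-5 and then decoding the bytes as UTF-8, which is roughly what a UTF-8 terminal does, gives one replacement character per Cyrillic letter.

```python
# Assumed text of the FATAL message; the email only shows it garbled.
assumed_message = 'ВАЖНО:  база данных "craig" не существует'

# What the postmaster would put on the wire under ru_RU.ISO-8859-5.
wire_bytes = assumed_message.encode("iso8859_5")

# What a UTF-8 consumer makes of those bytes: every Cyrillic byte is an
# invalid UTF-8 sequence on its own and becomes U+FFFD.
as_utf8 = wire_bytes.decode("utf-8", errors="replace")
print(as_utf8)

# Decoding with the right encoding recovers the text, but the client
# has no way to know that ISO-8859-5 is the right one.
print(wire_bytes.decode("iso8859_5"))
```

The ASCII parts (the colon, the spaces and the quoted database name) survive because ISO-8859-5 and UTF-8 agree on them, which matches the psql output above.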


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


