Thread: current breakage with PGCLIENTENCODING

current breakage with PGCLIENTENCODING

From
Tatsuo Ishii
Date:
It seems that current does not handle the PGCLIENTENCODING environment
variable correctly.

$ PGCLIENTENCODING=SJIS psql test
Welcome to psql 7.4devel, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

Pager is off.
test=# \encoding
SJIS
test=# show client_encoding;
 client_encoding
-----------------
 SJIS
(1 row)

test=# select pg_client_encoding();
 pg_client_encoding
--------------------
 EUC_JP
(1 row)

As you can see, the results of "show client_encoding;" and "select
pg_client_encoding();" do not match. I'm sure that they did match at
least in the 2003/03/24 version.

Any idea?
--
Tatsuo Ishii



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> As you can see, the results of "show client_encoding;" and "select
> pg_client_encoding();" do not match.

Weird, it works fine for me:

$ PGCLIENTENCODING=SJIS psql regression
Welcome to psql 7.4devel, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

regression=# \encoding
SJIS
regression=# show client_encoding;
 client_encoding
-----------------
 SJIS
(1 row)

regression=# select pg_client_encoding();
 pg_client_encoding
--------------------
 SJIS
(1 row)

I would not have been real surprised to hear that psql's \encoding is
out of sync, but it *does* surprise me that "show client_encoding" might
not match pg_client_encoding().  I would think those are looking at the
same backend state variable.  Any theory how that could happen?
        regards, tom lane



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
I said:
> I would not have been real surprised to hear that psql's \encoding is
> out of sync, but it *does* surprise me that "show client_encoding" might
> not match pg_client_encoding().  I would think those are looking at the
> same backend state variable.  Any theory how that could happen?

It occurs to me that the CVS-tip code tries to set client_encoding much
earlier in backend startup than was the case when this was driven by
a SET command issued by libpq after backend startup completed.  However,
it works fine for me.  Why isn't it working for you?
        regards, tom lane



Re: current breakage with PGCLIENTENCODING

From
Tatsuo Ishii
Date:
> > I would not have been real surprised to hear that psql's \encoding is
> > out of sync, but it *does* surprise me that "show client_encoding" might
> > not match pg_client_encoding().  I would think those are looking at the
> > same backend state variable.  Any theory how that could happen?
> 
> It occurs to me that the CVS-tip code tries to set client_encoding much
> earlier in backend startup than was the case when this was driven by
> a SET command issued by libpq after backend startup completed.  However,
> it works fine for me.  Why isn't it working for you?

I guess that's because your database encoding is SQL_ASCII. Could you
try with an EUC_JP encoded database?
--
Tatsuo Ishii



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I guess that's because your database encoding is SQL_ASCII. Could you
> try with an EUC_JP encoded database?

I get the same results with UNICODE or EUC_JP databases:
pg_client_encoding() reports SJIS as expected.

It does look like something's broken though, because some of the
src/test/mb tests fail:

$ sh mbregress.sh
dropdb: database removal failed: ERROR:  DROP DATABASE: database "unitest" does not exist
CREATE DATABASE
euc_jp ..  ok
sjis ..  failed
euc_kr ..  ok
euc_cn ..  ok
euc_tw ..  ok
big5 ..  failed
unicode ..  ok
mule_internal ..  ok
$

The full diffs are attached --- do they make any sense to you?  At least
some of the changes are intentional: the parser error position is now
counted in characters not bytes.

            regards, tom lane
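
To illustrate why the expected error positions change once the parser counts characters rather than bytes, here is a minimal Python sketch. It is not PostgreSQL code: `error_positions` is a hypothetical helper, the sample query is invented, and Python's `euc_jp` and `utf-8` codecs merely stand in for the corresponding server encodings.

```python
from typing import Tuple


def error_positions(query: str, marker: str, encoding: str) -> Tuple[int, int]:
    """Return (byte_position, char_position), both 1-based, of marker in query.

    Hypothetical helper for illustration only.
    """
    char_index = query.index(marker)
    # Bytes that precede the marker once the query is encoded.
    byte_index = len(query[:char_index].encode(encoding))
    return byte_index + 1, char_index + 1


if __name__ == "__main__":
    # "frm" is a deliberate typo for "from"; it follows a multibyte literal.
    query = "select '日本語' frm t"
    for enc in ("euc_jp", "shift_jis", "utf-8"):
        print(enc, error_positions(query, "frm", enc))
```

For this sample query the character position is 14 in every encoding, while the byte position is 17 in EUC_JP and 20 in UTF-8, so expected output files that recorded byte positions would differ after the change even when nothing is broken.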


Attachment

Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
Tatsuo, do you perhaps have a setting for client_encoding somewhere in
your postgresql.conf (uncommented) or in per-user or per-database GUC
settings?
        regards, tom lane
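
One way to check the postgresql.conf part of that question is to scan the file for an uncommented client_encoding line. The following Python sketch assumes the usual file syntax (`name = value`, with `#` starting a comment and the `=` optional); `active_settings` is a hypothetical helper, and it does not look at per-user or per-database settings, which live in the catalogs rather than in the file.

```python
import re
from typing import List, Tuple

# An active line looks like "name = value" or "name value" (assumed syntax).
SETTING_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_.]*)\s*=?\s*(.*?)\s*$")


def active_settings(conf_text: str, name: str) -> List[Tuple[int, str]]:
    """Return (line_number, value) for every uncommented setting of `name`."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        # Naive comment stripping: a '#' inside a quoted value is not handled.
        active_part = line.split("#", 1)[0]
        match = SETTING_RE.match(active_part)
        if match and match.group(1).lower() == name.lower():
            hits.append((lineno, match.group(2).strip("'")))
    return hits


if __name__ == "__main__":
    sample = (
        "#client_encoding = sql_ascii\n"
        "max_connections = 32\n"
        "client_encoding = 'EUC_JP'  # left over from testing\n"
    )
    print(active_settings(sample, "client_encoding"))
```

An empty result means the file itself does not set the variable, which narrows the search to the environment and the per-user or per-database settings.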



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
After staring at this for awhile, I've made some changes in
SetClientEncoding to clean up corner cases (it wasn't honoring DoIt in
all cases, and not being completely consistent about when to set
need_to_init_client_encoding either).  I am not sure whether this
solves your problem or not, though --- would you check?

Also, I think that the src/test/mb failures are unrelated.  Perhaps we
just need to update the expected results there?  Or are there real
problems?
        regards, tom lane



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
I said:
> Also, I think that the src/test/mb failures are unrelated.  Perhaps we
> just need to update the expected results there?  Or are there real
> problems?

Oh ... I'm an idiot.  CVS tip isn't applying any encoding conversion
to incoming query strings.  I forgot a step in rearranging the handling
of incoming messages :-(.  Will fix shortly.
        regards, tom lane
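
A rough Python sketch of what the missing step amounts to: bytes arriving in the client encoding have to be converted to the server encoding before they are parsed. This is only an illustration under assumed names (`convert_incoming` is hypothetical, and Python's `shift_jis` and `euc_jp` codecs stand in for the backend's conversion procedures); it does not reproduce the backend code.

```python
def convert_incoming(raw: bytes, client_encoding: str, server_encoding: str) -> bytes:
    """Convert an incoming query string from client to server encoding."""
    if client_encoding == server_encoding:
        return raw  # nothing to do
    return raw.decode(client_encoding).encode(server_encoding)


if __name__ == "__main__":
    query = "select 'テスト'"
    sent = query.encode("shift_jis")  # what an SJIS client puts on the wire

    # With the conversion step, the server sees valid EUC_JP text.
    converted = convert_incoming(sent, "shift_jis", "euc_jp")
    print(converted.decode("euc_jp") == query)

    # Without it, the SJIS bytes are treated as if they were EUC_JP.
    try:
        print(sent.decode("euc_jp"))
    except UnicodeDecodeError as exc:
        print("unconverted bytes are not valid EUC_JP:", exc.reason)
```

Skipping the conversion would leave pure-ASCII queries unaffected, which is consistent with only some of the multibyte tests failing.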



Re: current breakage with PGCLIENTENCODING

From
Tatsuo Ishii
Date:
> Oh ... I'm an idiot.  CVS tip isn't applying any encoding conversion
> to incoming query strings.  I forgot a step in rearranging the handling
> of incoming messages :-(.  Will fix shortly.

Ok.

BTW, I have dug into the problem where pg_client_encoding() returns
EUC_JP even though psql starts up with PGCLIENTENCODING=SJIS. The weird
thing is that when I created a fresh DB cluster, pg_client_encoding()
returned SJIS. Of course, both postgresql.conf files are identical. I
debugged with gdb and found that the cause was that SearchCatCacheList()
could not get the syscache entry for pg_conversion. Tom, do you have any
idea about this?

#0  SearchCatCacheList (cache=0x82a4198, nkeys=3, v1=2200, v2=28, v3=1, v4=0)
    at catcache.c:1475
#1  0x0818d284 in SearchSysCacheList (cacheId=11, nkeys=3, key1=2200, key2=28,
    key3=1, key4=0) at syscache.c:736
#2  0x0809e7a9 in FindDefaultConversion (name_space=2200, for_encoding=28,
    to_encoding=1) at pg_conversion.c:215
#3  0x0809afa3 in FindDefaultConversionProc (for_encoding=28, to_encoding=1)
    at namespace.c:1431
#4  0x0819ff47 in SetClientEncoding (encoding=28, doit=1 '\001')
    at mbutils.c:83
#5  0x081a0085 in InitializeClientEncoding () at mbutils.c:135
#6  0x08195208 in InitPostgres (dbname=0x826d910 "test",
    username=0x8267830 "t-ishii") at postinit.c:404
#7  0x081413fe in PostgresMain (argc=6, argv=0x82678c0,
    username=0x8267830 "t-ishii") at postgres.c:1812
#8  0x08123eeb in DoBackend (port=0x8273978) at postmaster.c:2450
#9  0x08123737 in BackendStartup (port=0x8273978) at postmaster.c:2069
#10 0x081222fe in ServerLoop () at postmaster.c:1031
#11 0x08121e5e in PostmasterMain (argc=3, argv=0x8265d50) at postmaster.c:811
#12 0x080f9e9f in main (argc=3, argv=0xbffff7b4) at main.c:209
--
Tatsuo Ishii



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> BTW, I have dug into the problem where pg_client_encoding() returns
> EUC_JP even though psql starts up with PGCLIENTENCODING=SJIS.

Is it still there for you in CVS tip?  I cannot reproduce it:

regression=# create database euc_jp encoding 'EUC_JP';
CREATE DATABASE
regression=# \q
$ PGCLIENTENCODING=SJIS psql euc_jp
Welcome to psql 7.4devel, the PostgreSQL interactive terminal.
... yadda yadda ...

euc_jp=# \encoding
SJIS
euc_jp=# show client_encoding ;
 client_encoding
-----------------
 SJIS
(1 row)

euc_jp=# select pg_client_encoding() ;
 pg_client_encoding
--------------------
 SJIS
(1 row)


I'm still wondering if you have a postgresql.conf or environment
setting that is affecting this.
        regards, tom lane



Re: current breakage with PGCLIENTENCODING

From
Tatsuo Ishii
Date:
> I'm still wondering if you have a postgresql.conf or environment
> setting that is affecting this.

It turned out that pg_conversion was empty for some unknown reason. That
was the source of the problem.
--
Tatsuo Ishii



Re: current breakage with PGCLIENTENCODING

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> I'm still wondering if you have a postgresql.conf or environment
>> setting that is affecting this.

> It turned out that pg_conversion was empty for some unknown reason. That
> was the source of the problem.

Ugh.  I think I have seen that happen too --- if the CREATE CONVERSION
commands fail during initdb, initdb doesn't notice or tell you about it.
It would be a good idea to fix that.  Perhaps the standalone backend
could be extended with a command that tells it to exit with nonzero
status if it gets any error?
        regards, tom lane
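
The general idea of having the driver notice a failed step through the child's exit status can be sketched as follows. This is a generic Python illustration, not initdb or backend code: `run_steps` is a hypothetical helper, and the child processes here are small Python one-liners standing in for a backend that exits with nonzero status on error.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[List[str]]) -> None:
    """Run each command in order and stop at the first nonzero exit status."""
    for argv in steps:
        result = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if result.returncode != 0:
            raise RuntimeError(
                "step %r failed with exit status %d" % (argv, result.returncode)
            )


if __name__ == "__main__":
    ok_step = [sys.executable, "-c", "raise SystemExit(0)"]
    bad_step = [sys.executable, "-c", "raise SystemExit(3)"]
    try:
        run_steps([ok_step, bad_step, ok_step])
    except RuntimeError as exc:
        # The driver reports the failure instead of silently continuing.
        print("setup aborted:", exc)
```

The point of the design is that the caller needs no knowledge of the error text; a nonzero status alone is enough to stop and report.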