Thread: client_encoding directive is ignored in postgresql.conf

client_encoding directive is ignored in postgresql.conf

From
Tatsuo Ishii
Date:
There is a nasty bug with the client_encoding directive in
postgresql.conf. It is simply ignored. This bug exists in both 7.3 or
later and in current. Interesting thing is "show client_encoding"
command shows expected encoding but this only shows the GUC internal
variable and the actual internal encoding state does not reflect
postgresql.conf. The cause of the bug is as follows:

1) postgresql.conf is read by GUC while postmaster starting up.
2) assign_client_encoding() is called to reflect client_encoding  directive in postgresq.conf
3) Since database access is not available during GUC processing, it  just set GUC variable and leave the real client
encodingstate as  it is.
 

Possible solution would be setting a flag during 2) and doing actual
client encoding processing after the database access becomes
possible. Included patches (against 7.3 stable tree) implement
this. New function InitializeClientEncoding() does client encoding
things and is called between call to InitializeSearchPath() and
on_shmem_exit() in InitPostgres().

Comments?

P.S.    Related bug exists in
fe-connect.c:PostgresPollingStatusType(). In this function to set the
client side encoding, getdatabaseencoding() is called. This is simply
wrong since it does not return the client encoding specified by
postgressql.conf's client_encoding directive. Instead
pg_client_encoding() should be called. (fix to this is included in the
patches)
--
Tatsuo Ishii

-------------------------------------------------------------------------------
Index: backend/utils/init/postinit.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/init/postinit.c,v
retrieving revision 1.117.2.1
diff -c -r1.117.2.1 postinit.c
*** backend/utils/init/postinit.c    21 Nov 2002 06:36:27 -0000    1.117.2.1
--- backend/utils/init/postinit.c    29 Jan 2003 13:13:04 -0000
***************
*** 397,402 ****
--- 397,405 ----     /* set default namespace search path */     InitializeSearchPath(); 
+     /* initialize client encoding */
+     InitializeClientEncoding();
+      /*      * Set up process-exit callback to do pre-shutdown cleanup.  This      * should be last because we want
shmem_exitto call this routine
 
Index: backend/utils/mb/mbutils.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/mb/mbutils.c,v
retrieving revision 1.36.2.1
diff -c -r1.36.2.1 mbutils.c
*** backend/utils/mb/mbutils.c    26 Nov 2002 02:37:13 -0000    1.36.2.1
--- backend/utils/mb/mbutils.c    29 Jan 2003 13:13:05 -0000
***************
*** 37,42 ****
--- 37,44 ----                                     int len, bool is_client_to_server); static int cliplen(const
unsignedchar *str, int len, int limit); 
 
+ /* Flag to we need to initialize client encoding info */
+ static bool need_to_init_client_encoding = -1;  /*  * Set the client encoding and save fmgrinfo for the converion
***************
*** 58,63 ****
--- 60,72 ----     if (!PG_VALID_FE_ENCODING(encoding))         return (-1); 
+     /* If we cannot actualy set client encoding info, remeber it
+      * so that we could set it using InitializeClientEncoding()
+      * in InitPostgres()
+      */
+     if (current_server_encoding != encoding && !IsTransactionState())
+         need_to_init_client_encoding = encoding;
+      if (current_server_encoding == encoding ||     (current_server_encoding == PG_SQL_ASCII || encoding ==
PG_SQL_ASCII))    {
 
***************
*** 113,118 ****
--- 122,140 ----         ToClientConvProc = to_client;     }     return 0;
+ }
+ 
+ /* Initialize client encoding if necessary.
+  * called from InitPostgres() once during backend starting up.
+  */
+ void
+ InitializeClientEncoding()
+ {
+     if (need_to_init_client_encoding > 0)
+     {
+         SetClientEncoding(need_to_init_client_encoding, 1);
+         need_to_init_client_encoding = -1;
+     } }  /*
Index: include/mb/pg_wchar.h
===================================================================
RCS file: /cvsroot/pgsql-server/src/include/mb/pg_wchar.h,v
retrieving revision 1.44
diff -c -r1.44 pg_wchar.h
*** include/mb/pg_wchar.h    4 Sep 2002 20:31:42 -0000    1.44
--- include/mb/pg_wchar.h    29 Jan 2003 13:13:25 -0000
***************
*** 295,300 ****
--- 295,301 ----  extern void SetDefaultClientEncoding(void); extern int    SetClientEncoding(int encoding, bool
doit);
+ extern void    InitializeClientEncoding(void); extern int    pg_get_client_encoding(void); extern const char
*pg_get_client_encoding_name(void);
 
Index: interfaces/libpq/fe-connect.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/interfaces/libpq/fe-connect.c,v
retrieving revision 1.213.2.1
diff -c -r1.213.2.1 fe-connect.c
*** interfaces/libpq/fe-connect.c    8 Jan 2003 21:33:53 -0000    1.213.2.1
--- interfaces/libpq/fe-connect.c    29 Jan 2003 13:13:27 -0000
***************
*** 1621,1627 ****                      * for it.  We must use begin/commit in case autocommit                      *
isoff by default.                      */
 
!                     if (!PQsendQuery(conn, "begin; select getdatabaseencoding(); commit"))
gotoerror_return;                      conn->setenv_state = SETENV_STATE_ENCODINGS_WAIT;
 
--- 1621,1627 ----                      * for it.  We must use begin/commit in case autocommit                      *
isoff by default.                      */
 
!                     if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit"))
gotoerror_return;                      conn->setenv_state = SETENV_STATE_ENCODINGS_WAIT;
 


Re: client_encoding directive is ignored in postgresql.conf

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> + /* Flag to we need to initialize client encoding info */
> + static bool need_to_init_client_encoding = -1;

Surely that should be int, not bool.

> !                     if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit"))

Doesn't this break compatibility with pre-7.2 databases?  AFAICT that
function was introduced in 7.2.
        regards, tom lane


Re: client_encoding directive is ignored in

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > + /* Flag to we need to initialize client encoding info */
> > + static bool need_to_init_client_encoding = -1;
> 
> Surely that should be int, not bool.

Oops. Will fix.

> > !                     if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit"))
> 
> Doesn't this break compatibility with pre-7.2 databases?  AFAICT that
> function was introduced in 7.2.

Yes, but there seems no other way to solve the problem and I thought we
do not guarantee the compatibilty between 7.3 frontend and pre 7.3 servers.
--
Tatsuo Ishii


Re: client_encoding directive is ignored in postgresql.conf

From
Bruce Momjian
Date:
Tom Lane wrote:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > + /* Flag to we need to initialize client encoding info */
> > + static bool need_to_init_client_encoding = -1;
> 
> Surely that should be int, not bool.
> 
> > !                     if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit"))
> 
> Doesn't this break compatibility with pre-7.2 databases?  AFAICT that
> function was introduced in 7.2.

I haven't seen this patch applied.  If this is the best way to fix the
bug, we may as well break libpq for pre-7.2 clients.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: client_encoding directive is ignored in

From
Bruce Momjian
Date:
Tatsuo Ishii wrote:
> > Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > > + /* Flag to we need to initialize client encoding info */
> > > + static bool need_to_init_client_encoding = -1;
> > 
> > Surely that should be int, not bool.
> 
> Oops. Will fix.
> 
> > > !                     if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit"))
> > 
> > Doesn't this break compatibility with pre-7.2 databases?  AFAICT that
> > function was introduced in 7.2.
> 
> Yes, but there seems no other way to solve the problem and I thought we
> do not guarantee the compatibilty between 7.3 frontend and pre 7.3 servers.

Yep.  Tatsuo, you should apply the patch to fix the problem.  Shame this
didn't make it into 7.3.2.  Should we backpatch?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: client_encoding directive is ignored in

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Yep.  Tatsuo, you should apply the patch to fix the problem.  Shame this
> didn't make it into 7.3.2.  Should we backpatch?

No.  I'm not happy that we're breaking libpq for pre-7.2 servers, and
I definitely don't want to see it done in 7.3.  That's way too short
notice.  Especially it's not something to spring on people in a
dot-release.  Put it in 7.4 with a fat compatibility warning in the
release notes.
        regards, tom lane


Re: client_encoding directive is ignored in

From
Bruce Momjian
Date:
OK, can we better document that GUC client_encoding is broken, then fix
in 7.4?

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Yep.  Tatsuo, you should apply the patch to fix the problem.  Shame this
> > didn't make it into 7.3.2.  Should we backpatch?
> 
> No.  I'm not happy that we're breaking libpq for pre-7.2 servers, and
> I definitely don't want to see it done in 7.3.  That's way too short
> notice.  Especially it's not something to spring on people in a
> dot-release.  Put it in 7.4 with a fat compatibility warning in the
> release notes.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: client_encoding directive is ignored in

From
Tatsuo Ishii
Date:
> OK, can we better document that GUC client_encoding is broken, then fix
> in 7.4?

Actually the problem can be divided into two parts:

1) backend does not process GUC client_encoding. 

2) libpq does not ask the backend's client_encoding, instead it asks  datanbase encoding when it starts up the
connection.This is just a  mistake.
 

I think we could fix 1) without any backward compatibilty problem and
should be applied to both 7.3-STATBLE and current.
--
Tatsuo Ishii


Re: client_encoding directive is ignored in

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Actually the problem can be divided into two parts:
> 1) backend does not process GUC client_encoding. 
> 2) libpq does not ask the backend's client_encoding, instead it asks
>    datanbase encoding when it starts up the connection. This is just a
>    mistake.

> I think we could fix 1) without any backward compatibilty problem and
> should be applied to both 7.3-STATBLE and current.

If we change the backend behavior without changing libpq, aren't we
breaking things even worse?  As long as libpq behaves as in (2), hadn't
the backend better init its idea of client_encoding to match
database_encoding?
        regards, tom lane


Re: client_encoding directive is ignored in

From
Tatsuo Ishii
Date:
> > Actually the problem can be divided into two parts:
> > 1) backend does not process GUC client_encoding. 
> > 2) libpq does not ask the backend's client_encoding, instead it asks
> >    datanbase encoding when it starts up the connection. This is just a
> >    mistake.
> 
> > I think we could fix 1) without any backward compatibilty problem and
> > should be applied to both 7.3-STATBLE and current.
> 
> If we change the backend behavior without changing libpq, aren't we
> breaking things even worse?  As long as libpq behaves as in (2), hadn't
> the backend better init its idea of client_encoding to match
> database_encoding?

Why? No matter how the backend's behavior regarding client_encoding
changes, libpq won't be affected by it since 7.2 and 7.3 libpq does
not use client_encoding anyway.
--
Tatsuo Ishii


Re: client_encoding directive is ignored in

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Why? No matter how the backend's behavior regarding client_encoding
> changes, libpq won't be affected by it since 7.2 and 7.3 libpq does
> not use client_encoding anyway.

Doesn't client_encoding determine what encoding the backend sends to
the client?

It's probable that libpq itself is not affected by the selected
client_encoding, since it only passes data through.  But I think that
applications are quite likely to be broken if we alter the backend's
behavior in a way that changes the default client encoding.  That
strikes me as a change that should be made at a major release, not
a minor one --- people don't expect to get hit by compatibility
problems when they do a minor upgrade.

But this argument is mostly irrelevant if the proposed change will not
affect behavior in a default installation.  I guess I'm not entirely
clear on exactly which cases it will affect.  What will your proposed
change do in each possible combination (database encoding is SQL_ASCII
or not, client_encoding is defined in postgresql.conf or not,
PGCLIENTENCODING is set in postmaster's environment or not, etc)?
        regards, tom lane


Re: client_encoding directive is ignored in

From
Tatsuo Ishii
Date:
> But this argument is mostly irrelevant if the proposed change will not
> affect behavior in a default installation.  I guess I'm not entirely
> clear on exactly which cases it will affect.  What will your proposed
> change do in each possible combination (database encoding is SQL_ASCII
> or not, client_encoding is defined in postgresql.conf or not,
> PGCLIENTENCODING is set in postmaster's environment or not, etc)?

The database encoding is set to the encoding when the database was
created and the default value of the client encoding is set to same as
the database encoding. This behavior will not be changed by the change
I proposed.

Anyway I will list up none default behaviors.

case 1: PGCLIENTENCODING environment variable is set when postmasterstarts up

case 2: -c 'client_encoding=some_encoding is set when postmasterstarts up

case 3: PGOPTIONS environment variable is set frontend starts up

case 4: GUC variable client_encoding is set
    current behavior: show client_encoding shows the encoding set by    PGCLIENTENCODING/postmaster opton/PGOPTIONS/GUC
variablebut    actual client encoding is same as the database encoding. Thus    pg_client_encoding() returns the
databaseencoding. Also    client/database encoding conversion is not peformed(bug).
 
    After changes: show client_encoding shows the encoding set by    PGCLIENTENCODING/postmaster opton/PGOPTIONS/GUC
variable.pg_client_encoding() returns the client encoding as    expected. Client/database encoding conversion will be
peformed.

case 5: PGCLIENTENCODING environment variable is set frontend startsup
case 6: SET client_encoding command is issued
    current behavior: show client_encoding shows the encoding set by    PGCLIENTENCODING/SET command.
pg_client_encoding()returns the    client encoding as expected. Client/database encoding conversion    will be
peformed.
    After changes: same as above.
--
Tatsuo Ishii


Re: client_encoding directive is ignored in

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> The database encoding is set to the encoding when the database was
> created and the default value of the client encoding is set to same as
> the database encoding. This behavior will not be changed by the change
> I proposed.

As long as it still behaves that way by default, I guess we won't create
any surprises.  Okay, I withdraw my objection.
        regards, tom lane