Thread: client_encoding directive is ignored in postgresql.conf
There is a nasty bug with the client_encoding directive in postgresql.conf. It is simply ignored. This bug exists in both 7.3 or later and in current. Interesting thing is "show client_encoding" command shows expected encoding but this only shows the GUC internal variable and the actual internal encoding state does not reflect postgresql.conf. The cause of the bug is as follows: 1) postgresql.conf is read by GUC while postmaster starting up. 2) assign_client_encoding() is called to reflect client_encoding directive in postgresq.conf 3) Since database access is not available during GUC processing, it just set GUC variable and leave the real client encodingstate as it is. Possible solution would be setting a flag during 2) and doing actual client encoding processing after the database access becomes possible. Included patches (against 7.3 stable tree) implement this. New function InitializeClientEncoding() does client encoding things and is called between call to InitializeSearchPath() and on_shmem_exit() in InitPostgres(). Comments? P.S. Related bug exists in fe-connect.c:PostgresPollingStatusType(). In this function to set the client side encoding, getdatabaseencoding() is called. This is simply wrong since it does not return the client encoding specified by postgressql.conf's client_encoding directive. Instead pg_client_encoding() should be called. (fix to this is included in the patches) -- Tatsuo Ishii ------------------------------------------------------------------------------- Index: backend/utils/init/postinit.c =================================================================== RCS file: /cvsroot/pgsql-server/src/backend/utils/init/postinit.c,v retrieving revision 1.117.2.1 diff -c -r1.117.2.1 postinit.c *** backend/utils/init/postinit.c 21 Nov 2002 06:36:27 -0000 1.117.2.1 --- backend/utils/init/postinit.c 29 Jan 2003 13:13:04 -0000 *************** *** 397,402 **** --- 397,405 ---- /* set default namespace search path */ InitializeSearchPath(); + /* initialize client encoding */ + InitializeClientEncoding(); + /* * Set up process-exit callback to do pre-shutdown cleanup. This * should be last because we want shmem_exitto call this routine Index: backend/utils/mb/mbutils.c =================================================================== RCS file: /cvsroot/pgsql-server/src/backend/utils/mb/mbutils.c,v retrieving revision 1.36.2.1 diff -c -r1.36.2.1 mbutils.c *** backend/utils/mb/mbutils.c 26 Nov 2002 02:37:13 -0000 1.36.2.1 --- backend/utils/mb/mbutils.c 29 Jan 2003 13:13:05 -0000 *************** *** 37,42 **** --- 37,44 ---- int len, bool is_client_to_server); static int cliplen(const unsignedchar *str, int len, int limit); + /* Flag to we need to initialize client encoding info */ + static bool need_to_init_client_encoding = -1; /* * Set the client encoding and save fmgrinfo for the converion *************** *** 58,63 **** --- 60,72 ---- if (!PG_VALID_FE_ENCODING(encoding)) return (-1); + /* If we cannot actualy set client encoding info, remeber it + * so that we could set it using InitializeClientEncoding() + * in InitPostgres() + */ + if (current_server_encoding != encoding && !IsTransactionState()) + need_to_init_client_encoding = encoding; + if (current_server_encoding == encoding || (current_server_encoding == PG_SQL_ASCII || encoding == PG_SQL_ASCII)) { *************** *** 113,118 **** --- 122,140 ---- ToClientConvProc = to_client; } return 0; + } + + /* Initialize client encoding if necessary. + * called from InitPostgres() once during backend starting up. + */ + void + InitializeClientEncoding() + { + if (need_to_init_client_encoding > 0) + { + SetClientEncoding(need_to_init_client_encoding, 1); + need_to_init_client_encoding = -1; + } } /* Index: include/mb/pg_wchar.h =================================================================== RCS file: /cvsroot/pgsql-server/src/include/mb/pg_wchar.h,v retrieving revision 1.44 diff -c -r1.44 pg_wchar.h *** include/mb/pg_wchar.h 4 Sep 2002 20:31:42 -0000 1.44 --- include/mb/pg_wchar.h 29 Jan 2003 13:13:25 -0000 *************** *** 295,300 **** --- 295,301 ---- extern void SetDefaultClientEncoding(void); extern int SetClientEncoding(int encoding, bool doit); + extern void InitializeClientEncoding(void); extern int pg_get_client_encoding(void); extern const char *pg_get_client_encoding_name(void); Index: interfaces/libpq/fe-connect.c =================================================================== RCS file: /cvsroot/pgsql-server/src/interfaces/libpq/fe-connect.c,v retrieving revision 1.213.2.1 diff -c -r1.213.2.1 fe-connect.c *** interfaces/libpq/fe-connect.c 8 Jan 2003 21:33:53 -0000 1.213.2.1 --- interfaces/libpq/fe-connect.c 29 Jan 2003 13:13:27 -0000 *************** *** 1621,1627 **** * for it. We must use begin/commit in case autocommit * isoff by default. */ ! if (!PQsendQuery(conn, "begin; select getdatabaseencoding(); commit")) gotoerror_return; conn->setenv_state = SETENV_STATE_ENCODINGS_WAIT; --- 1621,1627 ---- * for it. We must use begin/commit in case autocommit * isoff by default. */ ! if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit")) gotoerror_return; conn->setenv_state = SETENV_STATE_ENCODINGS_WAIT;
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > + /* Flag to we need to initialize client encoding info */ > + static bool need_to_init_client_encoding = -1; Surely that should be int, not bool. > ! if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit")) Doesn't this break compatibility with pre-7.2 databases? AFAICT that function was introduced in 7.2. regards, tom lane
> Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > + /* Flag to we need to initialize client encoding info */ > > + static bool need_to_init_client_encoding = -1; > > Surely that should be int, not bool. Oops. Will fix. > > ! if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit")) > > Doesn't this break compatibility with pre-7.2 databases? AFAICT that > function was introduced in 7.2. Yes, but there seems no other way to solve the problem and I thought we do not guarantee the compatibilty between 7.3 frontend and pre 7.3 servers. -- Tatsuo Ishii
Tom Lane wrote: > Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > + /* Flag to we need to initialize client encoding info */ > > + static bool need_to_init_client_encoding = -1; > > Surely that should be int, not bool. > > > ! if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit")) > > Doesn't this break compatibility with pre-7.2 databases? AFAICT that > function was introduced in 7.2. I haven't seen this patch applied. If this is the best way to fix the bug, we may as well break libpq for pre-7.2 clients. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tatsuo Ishii wrote: > > Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > > + /* Flag to we need to initialize client encoding info */ > > > + static bool need_to_init_client_encoding = -1; > > > > Surely that should be int, not bool. > > Oops. Will fix. > > > > ! if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit")) > > > > Doesn't this break compatibility with pre-7.2 databases? AFAICT that > > function was introduced in 7.2. > > Yes, but there seems no other way to solve the problem and I thought we > do not guarantee the compatibilty between 7.3 frontend and pre 7.3 servers. Yep. Tatsuo, you should apply the patch to fix the problem. Shame this didn't make it into 7.3.2. Should we backpatch? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Yep. Tatsuo, you should apply the patch to fix the problem. Shame this > didn't make it into 7.3.2. Should we backpatch? No. I'm not happy that we're breaking libpq for pre-7.2 servers, and I definitely don't want to see it done in 7.3. That's way too short notice. Especially it's not something to spring on people in a dot-release. Put it in 7.4 with a fat compatibility warning in the release notes. regards, tom lane
OK, can we better document that GUC client_encoding is broken, then fix in 7.4? --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Yep. Tatsuo, you should apply the patch to fix the problem. Shame this > > didn't make it into 7.3.2. Should we backpatch? > > No. I'm not happy that we're breaking libpq for pre-7.2 servers, and > I definitely don't want to see it done in 7.3. That's way too short > notice. Especially it's not something to spring on people in a > dot-release. Put it in 7.4 with a fat compatibility warning in the > release notes. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> OK, can we better document that GUC client_encoding is broken, then fix > in 7.4? Actually the problem can be divided into two parts: 1) backend does not process GUC client_encoding. 2) libpq does not ask the backend's client_encoding, instead it asks datanbase encoding when it starts up the connection.This is just a mistake. I think we could fix 1) without any backward compatibilty problem and should be applied to both 7.3-STATBLE and current. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > Actually the problem can be divided into two parts: > 1) backend does not process GUC client_encoding. > 2) libpq does not ask the backend's client_encoding, instead it asks > datanbase encoding when it starts up the connection. This is just a > mistake. > I think we could fix 1) without any backward compatibilty problem and > should be applied to both 7.3-STATBLE and current. If we change the backend behavior without changing libpq, aren't we breaking things even worse? As long as libpq behaves as in (2), hadn't the backend better init its idea of client_encoding to match database_encoding? regards, tom lane
> > Actually the problem can be divided into two parts: > > 1) backend does not process GUC client_encoding. > > 2) libpq does not ask the backend's client_encoding, instead it asks > > datanbase encoding when it starts up the connection. This is just a > > mistake. > > > I think we could fix 1) without any backward compatibilty problem and > > should be applied to both 7.3-STATBLE and current. > > If we change the backend behavior without changing libpq, aren't we > breaking things even worse? As long as libpq behaves as in (2), hadn't > the backend better init its idea of client_encoding to match > database_encoding? Why? No matter how the backend's behavior regarding client_encoding changes, libpq won't be affected by it since 7.2 and 7.3 libpq does not use client_encoding anyway. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > Why? No matter how the backend's behavior regarding client_encoding > changes, libpq won't be affected by it since 7.2 and 7.3 libpq does > not use client_encoding anyway. Doesn't client_encoding determine what encoding the backend sends to the client? It's probable that libpq itself is not affected by the selected client_encoding, since it only passes data through. But I think that applications are quite likely to be broken if we alter the backend's behavior in a way that changes the default client encoding. That strikes me as a change that should be made at a major release, not a minor one --- people don't expect to get hit by compatibility problems when they do a minor upgrade. But this argument is mostly irrelevant if the proposed change will not affect behavior in a default installation. I guess I'm not entirely clear on exactly which cases it will affect. What will your proposed change do in each possible combination (database encoding is SQL_ASCII or not, client_encoding is defined in postgresql.conf or not, PGCLIENTENCODING is set in postmaster's environment or not, etc)? regards, tom lane
> But this argument is mostly irrelevant if the proposed change will not > affect behavior in a default installation. I guess I'm not entirely > clear on exactly which cases it will affect. What will your proposed > change do in each possible combination (database encoding is SQL_ASCII > or not, client_encoding is defined in postgresql.conf or not, > PGCLIENTENCODING is set in postmaster's environment or not, etc)? The database encoding is set to the encoding when the database was created and the default value of the client encoding is set to same as the database encoding. This behavior will not be changed by the change I proposed. Anyway I will list up none default behaviors. case 1: PGCLIENTENCODING environment variable is set when postmasterstarts up case 2: -c 'client_encoding=some_encoding is set when postmasterstarts up case 3: PGOPTIONS environment variable is set frontend starts up case 4: GUC variable client_encoding is set current behavior: show client_encoding shows the encoding set by PGCLIENTENCODING/postmaster opton/PGOPTIONS/GUC variablebut actual client encoding is same as the database encoding. Thus pg_client_encoding() returns the databaseencoding. Also client/database encoding conversion is not peformed(bug). After changes: show client_encoding shows the encoding set by PGCLIENTENCODING/postmaster opton/PGOPTIONS/GUC variable.pg_client_encoding() returns the client encoding as expected. Client/database encoding conversion will be peformed. case 5: PGCLIENTENCODING environment variable is set frontend startsup case 6: SET client_encoding command is issued current behavior: show client_encoding shows the encoding set by PGCLIENTENCODING/SET command. pg_client_encoding()returns the client encoding as expected. Client/database encoding conversion will be peformed. After changes: same as above. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > The database encoding is set to the encoding when the database was > created and the default value of the client encoding is set to same as > the database encoding. This behavior will not be changed by the change > I proposed. As long as it still behaves that way by default, I guess we won't create any surprises. Okay, I withdraw my objection. regards, tom lane