Thread: set_client_encoding is broken
If you look on gothic_moth and comet_moth http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=gothic_moth&dt=2009-08-30%2020:06:00 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2009-08-29%2021:06:00 you can see following error: ../../src/test/regress/pg_regress --inputdir=. --psqldir=/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/inst/bin --dbname=contrib_regression --multibyte=UTF8 --no-locale unaccent (using postmaster on Unix socket, default port) ============== dropping database "contrib_regression" ============== psql: FATAL: invalid value for parameter "client_encoding": "UTF8" command failed: "/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/inst/bin/psql" -X -c "DROP DATABASE IF EXISTS \"contrib_regression\"""postgres" gmake[1]: *** [installcheck] Error 2 gmake[1]: Leaving directory `/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/pgsql.4092/contrib/unaccent' [4a9ae815.696e:1] LOG: connection received: host=[local] [4a9ae815.696e:2] LOG: connection authorized: user=postgres database=postgres [4a9ae815.696e:3] LOG: conversion between UTF8 and LATIN2 is not supported [4a9ae815.696e:4] FATAL: invalid value for parameter "client_encoding": "UTF8" The assign_client_encoding->SetClientEncoding fails to find conversion function. http://doxygen.postgresql.org/backend_2commands_2variable_8c.html#7f2d0624e7b7fb46644c5ce284e6479c http://doxygen.postgresql.org/mbutils_8c.html#8eeff4ecab443ba7073c426fcd4bc4d6 I guess that flat auth file patch http://archives.postgresql.org/pgsql-committers/2009-08/msg00301.php is culprint. It seems that backend does not have loaded pg_encoding yet when SetClientEncoding is processed. Zdenek
Zdenek Kotala <Zdenek.Kotala@sun.com> writes: > [4a9ae815.696e:1] LOG: connection received: host=[local] > [4a9ae815.696e:2] LOG: connection authorized: user=postgres database=postgres > [4a9ae815.696e:3] LOG: conversion between UTF8 and LATIN2 is not supported > [4a9ae815.696e:4] FATAL: invalid value for parameter "client_encoding": "UTF8" > The assign_client_encoding->SetClientEncoding fails to find conversion function. Hmm. The reason this used to work is that SetClientEncoding does no real work if it's invoked before InitializeClientEncoding. The old method of handling client_encoding in the client's startup message had the setting get processed before InitPostgres, then in InitPostgres we'd call InitializeClientEncoding within the startup transaction, and it would complete the unfinished catalog lookup. In CVS HEAD we postpone the GUC processing till after InitPostgres, but it's still outside of any transaction, so SetClientEncoding just fails. There are a number of possible solutions: 1. We could revert the changes in GUC handling, ie go back to applying non-SUSET GUCs before InitPostgres and SUSET ones afterwards. I don't much care for this; that code was really ugly, and I'm still worried about the possible security exposure of letting not-yet-authenticated users set GUCs, even ones we think are harmless. 2. Move the InitializeClientEncoding call out of InitPostgres and put it in PostgresMain after the GUC variables are all set. This is pretty bad from a performance point of view, though, since it appears to require a second startup-time transaction. 3. Push the startup-packet GUC processing (approx. lines 3340..3395 of postgres.c, as of CVS HEAD) into InitPostgres, so it can be run during the startup transaction. This is not too unclean, though it would mean exporting process_postgres_switches() from postgres.c; I guess the main thing I don't like about it is that InitPostgres has enough weird responsibilities already. I'm leaning to the third choice, but I wonder if anyone has any comments or better ideas. regards, tom lane
Tom Lane píše v po 31. 08. 2009 v 11:00 -0400: > 3. Push the startup-packet GUC processing (approx. lines 3340..3395 of > postgres.c, as of CVS HEAD) into InitPostgres, so it can be run during > the startup transaction. This is not too unclean, though it would > mean exporting process_postgres_switches() from postgres.c; I guess > the main thing I don't like about it is that InitPostgres has enough > weird responsibilities already. > > I'm leaning to the third choice, but I wonder if anyone has any > comments > or better ideas. It seems to me that 3 is OK. Another possibility is that InitPostgres can only fill up rel cache and GUC processing can stay on the same place. But in general, this problem can affect any other GUC variable which has assign hook and needs to lookup. I don't know how it works before, but I'm afraid that user can get error message in server encoding before it is correctly set. Zdenek
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes: > Tom Lane píše v po 31. 08. 2009 v 11:00 -0400: >> I'm leaning to the third choice, but I wonder if anyone has any >> comments or better ideas. > It seems to me that 3 is OK. > Another possibility is that InitPostgres can only fill up rel cache and > GUC processing can stay on the same place. But in general, this problem > can affect any other GUC variable which has assign hook and needs to > lookup. Yeah, if it was *only* client_encoding then I wouldn't mind a hack solution too much, but search_path is similarly affected and it's not hard to foresee other GUCs in future that might require catalog access. So I'd prefer a reasonably clean solution. > I don't know how it works before, but I'm afraid that user can get error > message in server encoding before it is correctly set. It's always been the case that messages could come out before we can set client_encoding. I believe we have things set up so that you'll get the untranslated, plain-ASCII-English message in that situation. Feel free to test ;-) regards, tom lane