Thread: set_client_encoding is broken

set_client_encoding is broken

From

Zdenek Kotala

Date:

31 August 2009, 05:48:54

If you look on gothic_moth and comet_moth

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=gothic_moth&dt=2009-08-30%2020:06:00
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2009-08-29%2021:06:00

you can see following error:

../../src/test/regress/pg_regress --inputdir=.
--psqldir=/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/inst/bin
--dbname=contrib_regression --multibyte=UTF8 --no-locale  unaccent
(using postmaster on Unix socket, default port)
============== dropping database "contrib_regression" ==============
psql: FATAL:  invalid value for parameter "client_encoding": "UTF8"
command failed: "/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/inst/bin/psql" -X -c "DROP DATABASE IF EXISTS
\"contrib_regression\"""postgres"
 
gmake[1]: *** [installcheck] Error 2
gmake[1]: Leaving directory `/zfs_data/home/postgres/buildfarm/gothic_moth/HEAD/pgsql.4092/contrib/unaccent'


[4a9ae815.696e:1] LOG:  connection received: host=[local]
[4a9ae815.696e:2] LOG:  connection authorized: user=postgres database=postgres
[4a9ae815.696e:3] LOG:  conversion between UTF8 and LATIN2 is not supported
[4a9ae815.696e:4] FATAL:  invalid value for parameter "client_encoding": "UTF8"

The assign_client_encoding->SetClientEncoding fails to find conversion function.
http://doxygen.postgresql.org/backend_2commands_2variable_8c.html#7f2d0624e7b7fb46644c5ce284e6479c
http://doxygen.postgresql.org/mbutils_8c.html#8eeff4ecab443ba7073c426fcd4bc4d6

I guess that flat auth file patch
http://archives.postgresql.org/pgsql-committers/2009-08/msg00301.php
is culprint.

It seems that backend does not have loaded pg_encoding yet when
SetClientEncoding is processed.
Zdenek

Re: set_client_encoding is broken

From

Tom Lane

Date:

31 August 2009, 12:01:05

Zdenek Kotala <Zdenek.Kotala@sun.com> writes:
> [4a9ae815.696e:1] LOG:  connection received: host=[local]
> [4a9ae815.696e:2] LOG:  connection authorized: user=postgres database=postgres
> [4a9ae815.696e:3] LOG:  conversion between UTF8 and LATIN2 is not supported
> [4a9ae815.696e:4] FATAL:  invalid value for parameter "client_encoding": "UTF8"

> The assign_client_encoding->SetClientEncoding fails to find conversion function.

Hmm.  The reason this used to work is that SetClientEncoding does
no real work if it's invoked before InitializeClientEncoding.  The
old method of handling client_encoding in the client's startup
message had the setting get processed before InitPostgres, then in
InitPostgres we'd call InitializeClientEncoding within the startup
transaction, and it would complete the unfinished catalog lookup.
In CVS HEAD we postpone the GUC processing till after InitPostgres,
but it's still outside of any transaction, so SetClientEncoding just
fails.

There are a number of possible solutions:

1. We could revert the changes in GUC handling, ie go back to applying
non-SUSET GUCs before InitPostgres and SUSET ones afterwards.  I don't
much care for this; that code was really ugly, and I'm still worried
about the possible security exposure of letting not-yet-authenticated
users set GUCs, even ones we think are harmless.

2. Move the InitializeClientEncoding call out of InitPostgres and put
it in PostgresMain after the GUC variables are all set.  This is pretty
bad from a performance point of view, though, since it appears to
require a second startup-time transaction.

3. Push the startup-packet GUC processing (approx. lines 3340..3395 of
postgres.c, as of CVS HEAD) into InitPostgres, so it can be run during
the startup transaction.  This is not too unclean, though it would
mean exporting process_postgres_switches() from postgres.c; I guess
the main thing I don't like about it is that InitPostgres has enough
weird responsibilities already.

I'm leaning to the third choice, but I wonder if anyone has any comments
or better ideas.
        regards, tom lane

Re: set_client_encoding is broken

From

Zdenek Kotala

Date:

31 August 2009, 15:19:58

Tom Lane píše v po 31. 08. 2009 v 11:00 -0400:
> 3. Push the startup-packet GUC processing (approx. lines 3340..3395 of
> postgres.c, as of CVS HEAD) into InitPostgres, so it can be run during
> the startup transaction.  This is not too unclean, though it would
> mean exporting process_postgres_switches() from postgres.c; I guess
> the main thing I don't like about it is that InitPostgres has enough
> weird responsibilities already.
> 
> I'm leaning to the third choice, but I wonder if anyone has any
> comments
> or better ideas.

It seems to me that 3 is OK.

Another possibility is that InitPostgres can only fill up rel cache and
GUC processing can stay on the same place. But in general, this problem
can affect any other GUC variable which has assign hook and needs to
lookup. 

I don't know how it works before, but I'm afraid that user can get error
message in server encoding before it is correctly set.

Zdenek

Re: set_client_encoding is broken

From

Tom Lane

Date:

31 August 2009, 16:59:50

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> Tom Lane píše v po 31. 08. 2009 v 11:00 -0400:
>> I'm leaning to the third choice, but I wonder if anyone has any
>> comments or better ideas.

> It seems to me that 3 is OK.

> Another possibility is that InitPostgres can only fill up rel cache and
> GUC processing can stay on the same place. But in general, this problem
> can affect any other GUC variable which has assign hook and needs to
> lookup. 

Yeah, if it was *only* client_encoding then I wouldn't mind a hack
solution too much, but search_path is similarly affected and it's not
hard to foresee other GUCs in future that might require catalog access.
So I'd prefer a reasonably clean solution.

> I don't know how it works before, but I'm afraid that user can get error
> message in server encoding before it is correctly set.

It's always been the case that messages could come out before we can set
client_encoding.  I believe we have things set up so that you'll get the
untranslated, plain-ASCII-English message in that situation.  Feel free
to test ;-)
        regards, tom lane