Re: parallel workers and client encoding - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: parallel workers and client encoding
Msg-id 20160610.081642.1227492324497708360.t-ishii@sraoss.co.jp
In response to parallel workers and client encoding  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: parallel workers and client encoding  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
> There appears to be a problem with how client encoding is handled in
> the communication from parallel workers.

Ouch.

>  In a parallel worker, the
> client encoding setting is inherited from its creating process as part
> of the GUC setup.  So any plain-text stuff the parallel worker sends
> to its leader is actually converted to the client encoding.  Since
> most data is sent in binary format, the plain-text provision applies
> mainly to notice and error messages.  At the other end, error messages
> are parsed using pq_parse_errornotice(), which internally uses
> routines that were meant for communication from the client, and
> therefore will convert everything back from the client encoding to the
> server encoding.  So this whole thing actually happens to work as long
> as round tripping is possible between the involved encodings.
>
> In cases where it isn't, it's still hard to notice the difference
> because depending on whether you get a parallel plan or not, the
> following happens:
>
> not parallel: conversion error happens between server and client,
> client sees an error message about that
>
> parallel: conversion error happens between worker and leader, worker
> generates an error message about that, sends it to leader, leader
> forwards it to client
>
> The client sees the same error message in both cases.
>
> To construct a case where this makes a difference, the leader has to
> be set up to catch certain errors.  Here is an example:
>
> """
> create table test1 (a int, b text);
> truncate test1;
> insert into test1 values (1, 'a');
>
> create or replace function test1() returns text language plpgsql
> as $$
> declare
>   res text;
> begin
>   perform from test1 where a = test2();
>   return res;
> exception when division_by_zero then
>   return 'boom';
> end;
> $$;
>
> create or replace function test2() returns int language plpgsql
> parallel safe
> as $$
> begin
>   raise division_by_zero using message = 'Motörhead';
>   return 1;
> end
> $$;
>
> set force_parallel_mode to on;
>
> select test1();
> """
>
> With client_encoding = server_encoding, this will return a single row
> 'boom'.  But with, say, database encoding UTF8 and
> PGCLIENTENCODING=KOI8R, it will error:
>
> ERROR: 22P05: character with byte sequence 0xef 0xbe 0x83 in encoding
> "UTF8" has no equivalent in encoding "KOI8R"
> CONTEXT:  parallel worker
>
> (Note that changing force_parallel_mode does not force replanning in
> plpgsql, so if you run test1() first before setting
> force_parallel_mode, then you won't get the error.)
>
> Attached is a patch to illustrate how this could be fixed.  There
> might be similar issues elsewhere.  The notification propagation in
> particular could be affected.

Something like SetClientEncoding(GetDatabaseEncoding()) is a little
bit ugly. It would be nice if we could have a switch to turn off the
automatic encoding conversion in the future, but for 9.6 I'm fine
with your proposed patch.
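
To make the round-trip issue concrete, here is a rough Python model of
the worker-to-leader path described above. It is only a sketch: Python
codecs stand in for the server's conversion routines, and the names
worker_send, leader_receive and relay are made up for illustration;
they are not PostgreSQL functions.

```python
# Illustrative model only: Python codecs stand in for PostgreSQL's
# encoding conversion. None of these names are PostgreSQL APIs.

SERVER_ENCODING = "utf-8"  # assumed database encoding for this sketch


def worker_send(message: str, worker_client_encoding: str) -> bytes:
    """Worker converts outgoing text to its (inherited) client encoding."""
    return message.encode(worker_client_encoding)


def leader_receive(payload: bytes, worker_client_encoding: str) -> str:
    """Leader converts the payload back from that same encoding."""
    return payload.decode(worker_client_encoding)


def relay(message: str, worker_client_encoding: str) -> str:
    """Pass a message from worker to leader, reporting conversion failure."""
    try:
        payload = worker_send(message, worker_client_encoding)
    except UnicodeEncodeError:
        # The original error is replaced by a conversion error, so a
        # handler waiting for the original error never sees it.
        return "conversion error"
    return leader_receive(payload, worker_client_encoding)


print(relay("Motörhead", "koi8_r"))         # conversion error
print(relay("Motörhead", "latin-1"))        # Motörhead
print(relay("Motörhead", SERVER_ENCODING))  # Motörhead
```

The last call roughly corresponds to having the worker use the database
encoding as its client encoding: the conversion on both ends becomes
lossless, so the original message always reaches the leader.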

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


