Re: Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding
Date
Msg-id CA+Tgmoa=3Rrbv11gOtO+TV8+LyPRjF9C9+LJ5eiYmhXbnfVo2A@mail.gmail.com
In response to Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding  (Alexander Law <exclusion@gmail.com>)
List pgsql-hackers
On Wed, Jul 25, 2012 at 7:54 AM, Alexander Law <exclusion@gmail.com> wrote:
> Hello,
> I would like to fix this bug, but it looks like it would not be a one-line
> patch.
> Looking at the pg_dump code I see that the object names come through the
> following chain:
> 1. pg_dump executes 'SELECT c.tableoid, c.oid, c.relname, ... ' and gets the
> object name in the encoding chosen for the database connection/dump.
> 2. It invokes write_msg or a similar function:
>
>     write_msg(NULL, "finding the columns and types of table \"%s\"\n",
>               tbinfo->dobj.name);
> 3. vwrite_msg localizes the text message, but not its argument(s):
>     vfprintf(stderr, _(fmt), ap);
> Here gettext (_) internally converts the translated fmt to the OS encoding
> (if that differs from UTF-8, the encoding of the localized strings).
>
> I can see only a few solutions to the problem:
> 1. Convert the object name at the back-end, i.e. modify all the similar
> SELECTs as:
>     'SELECT c.tableoid, c.oid, c.relname,
>      convert_to(c.relname, 'OS_ENCODING') AS locrelname, ...'
> and then do:
>     write_msg(NULL, "finding the columns and types of table \"%s\"\n",
>               tbinfo->dobj.local_name);
> The downside of this approach is that it requires rewriting all the SELECTs
> for all the objects. And it doesn't help us write out any other text from
> the backend, such as localized backend errors.
>
> 2. Set up another connection to the backend using the OS encoding, and get
> all the object names through it. This looks insane too, and we have the same
> problem with the localized backend errors arriving on the "main" connection.
>
> 3. Write a convert_to_os_encoding(text, encoding) function for the frontend
> utilities. Unfortunately, the frontend can't use the internal PostgreSQL
> conversion functions, and modifying them to be usable through libpq looks
> unfeasible.
> So the only way to implement such a function is to use another encoding
> conversion framework (library).
> And my question is: is it possible to add libiconv as a dependency of the
> frontend utilities code?
>
> 4. Force users to use the OS encoding as the database encoding, or to avoid
> non-ASCII characters in database object names, and disable NLS on Windows
> completely. That doesn't look like a solution at all.
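
For reference, the kind of frontend helper described in #3 might look roughly like the sketch below. It assumes a POSIX iconv implementation is available (libiconv or the one built into libc). The name convert_encoding and its signature are hypothetical; nothing like it exists in pg_dump, and the output buffer sizing is a simple worst-case guess rather than a tuned value.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of pg_dump: convert a NUL-terminated string
 * from from_enc to to_enc with iconv.  Returns a malloc'd string that the
 * caller must free, or NULL if the conversion is unsupported or fails.
 */
static char *
convert_encoding(const char *src, const char *from_enc, const char *to_enc)
{
    iconv_t cd = iconv_open(to_enc, from_enc);

    if (cd == (iconv_t) -1)
        return NULL;            /* unknown encoding name or unsupported pair */

    size_t inleft = strlen(src);
    /* Assumption: 4 output bytes per input byte is enough for this sketch. */
    size_t outsize = inleft * 4 + 1;
    size_t outleft = outsize - 1;
    char  *out = malloc(outsize);

    if (out == NULL)
    {
        iconv_close(cd);
        return NULL;
    }

    char *inptr = (char *) src; /* iconv does not modify the input bytes */
    char *outptr = out;

    /* Convert the text, then flush any pending shift state. */
    if (iconv(cd, &inptr, &inleft, &outptr, &outleft) == (size_t) -1 ||
        iconv(cd, NULL, NULL, &outptr, &outleft) == (size_t) -1)
    {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *outptr = '\0';
    iconv_close(cd);
    return out;
}
```

A caller would convert the object name before passing it to write_msg and fall back to the unconverted name when the helper returns NULL, for example when the name contains characters that the OS encoding cannot represent.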

I think if you're going to try to do something about this, #1 is
probably the best option.

It does sound like a lot of work, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

