Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail - Mailing list pgsql-bugs

From Jan Lentfer
Subject Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail
Date
Msg-id 666abab3fe03561c77dbfc661b023985@neslonek.homeunix.org
Whole thread Raw
Responses Re: Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail
List pgsql-bugs
We found a problem with comments on databases using german umlauts (=C3=A4=
=C3=B6=C3=BC)=20
- or more generally, we found encoding problems when doing "COMMENT ON=20
database foo ..." when foo is UTF-8 encoded and the database you are=20
connected to (e.g. postgres) is SQL_ASCII.
I analyzed this with the help of Andrew Gierth (and others) on IRC - it=20
seems in that constellation you can write an non-UTF-8 comment onto a=20
UTF-8 database.
This leads to the problem, that wen trying to use pg_restore -C with a=20
dump created with -Fc the restore will fail when trying to do the=20
"COMMENT ON DATABASE.."
We stumpled across this because also pg_upgrade fails on this (during=20
the schema part), but doesn't detect this situation with the -c (check)=20
option beforehand.
This is 9.4.5 on Solaris 11 - but we did have that problem already when=20
using pg_upgrade from 8.4 to 9.1 a few years back. I am sorry we didn't=20
analyze further back then, but it was "just a comment". But now it bit=20
us again....

Here is what I did so far to analyze the problem (with instrucions from=20
Andrew Gierth). I hope this makes it clear, otherwise please don't=20
hesitate to request more details.


postgres=3D# create database comment_test template template0 encoding=20
'UTF-8';
CREATE DATABASE
#####

---> use pgadmin3 on Windows to set comment on database coment_test with=20
string "f=C3=BCr", while being connected to postgres (SQL_ASCII encoded)
#####

$ pg_dump -Fc comment_test -f comment_test.pgdump
postgres=3D# alter database comment_test rename to comment_test_orig;
ALTER DATABASE
#####

$ pg_restore -C -d template1 comment_test.pgdump
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1989; 1262 641528=20
COMMENT comment_test postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  invalid=20
byte sequence for encoding "UTF8": 0xfc
     Command was: COMMENT ON DATABASE comment_test IS 'f=C2=A6r';
###

postgres@sz-pp-pg02-dev[~] $ pg_restore comment_test.pgdump
--
-- PostgreSQL database dump
--

SET statement_timeout =3D 0;
SET lock_timeout =3D 0;
SET client_encoding =3D 'UTF8';
SET standard_conforming_strings =3D on;
SET check_function_bodies =3D false;
SET client_min_messages =3D warning;

--
-- Name: comment_test; Type: COMMENT; Schema: -; Owner: postgres
--

COMMENT ON DATABASE comment_test IS 'f=C2=A6r';


--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;


--
-- PostgreSQL database dump complete
--
#######

  $ pg_restore comment_test.pgdump | head -16 | tail -1 |xxd
0000000: 434f 4d4d 454e 5420 4f4e 2044 4154 4142  COMMENT ON DATAB
0000010: 4153 4520 636f 6d6d 656e 745f 7465 7374  ASE comment_test
0000020: 2049 5320 2766 fc72 273b 0a               IS 'f.r';.
######

postgres=3D# select=20
convert_to(shobj_description(oid,'pg_database'),'SQL_ASCII') from=20
pg_database where datname=3D'comment_test_orig';
  convert_to
------------
  \x66fc72
(1 row)
######


postgres=3D# \l
                                             List of databases
         Name        |   Owner   | Encoding  |     Collate      |     =20
Ctype       |   Access privileges
--------------------+-----------+-----------+------------------+---------=
---------+-----------------------
  comment_test       | postgres  | UTF8      | C                | C      =
=20
          |
  [...]
  postgres           | postgres  | SQL_ASCII | C                | C      =
=20
          |
  template0          | postgres  | SQL_ASCII | C                | C      =
=20
          | =3Dc/postgres          +
                     |           |           |                  |        =
=20
          | postgres=3DCTc/postgres
  template1          | postgres  | SQL_ASCII | C                | C      =
=20
          | postgres=3DCTc/postgres+
                     |           |           |                  |        =
=20
          | =3Dc/postgres

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #13888: pg_dump write error
Next
From: Dennis Kögel
Date:
Subject: Parallel pg_restore fails to import views with triggers (dependency problem?)